(54) Drug targets in Candida albicans



(57) Nucleic acid molecules encoding polypeptides that are critical for survival and growth
of the yeast Candida albicans are disclosed. Also provided are methods of identifying
compounds which selectively modulate expression or activity of such polypeptides,
comprising the steps of (a) contacting a compound to be tested with one or more Candida
albicans cells having a mutation in a nucleic acid molecule according to the invention,
which mutation results in overexpression or underexpression of said polypeptides, in
addition to contacting one or more wild type Candida albicans cells with said compound,
and (b) monitoring the growth and/or activity of said mutated cell compared to said wild
type; wherein differential growth or activity of said one or more mutated Candida cells is
indicative of selective action of said compound on a polypeptide or another polypeptide in
the same or a parallel pathway.



The Search Division considers that the present European patent application does not comply with the
requirements of unity of invention and relates to several inventions or groups of inventions, namely:

1. Claims: Invention 1: claims 1, 4-7, 9, 11-20, 30, 31 partially

Nucleic acid molecule comprising seq.ID.1 or capable of
hybridizing thereto, expression vector comprising said
nucleic acid, use of said vector for preparation of
medicament or pharmaceutical composition, C. albicans cell
comprising an induced mutation in said DNA sequence,
oligonucleotides comprising 10-50 nt of said nucleic acid
sequence, and method for identifying compounds which
modulate expression of said nucleic acid.



2. Claims: Inventions 2-41: claims 1, 4-7, 9, 11-20, 30, 31 partially, and 2, 3, 8, 10, 32,
33 partially as applicable

As invention 1, but limited to the respective nucleic acid
sequences 2, 3, 5, 6, 8, 9, 10, 11, 13, 15, 16, 18, 20, 21, 23, 25, 26, 27,
28, 29, 31, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67,
69 and 71, and polypeptide sequences corresponding to said
nucleic acid sequences in as far as they are provided,
whereby invention 2 is limited to seq.ID.2, invention 3 is
limited to seq.ID.3 and its translated polypeptide seq.ID.4,
and so on, up to invention 41, which is limited to seq.ID.71 and its
translated polypeptide sequence seq.ID.72.

In as far as a polypeptide sequence, translated from the ORF
of a corresponding nucleic acid sequence, is provided, the
polypeptide encoded by the corresponding nucleic acid
sequence and its use in the preparation of a medicament,
and antibodies against said polypeptide, are also considered
part of the respective invention.



3. Claims: Invention 42: claims 25-29

Method for identifying DNA sequences from a cell or
organism, which encode polypeptides which are critical for
growth and survival for said cell or organism, comprising
screening a library of nucleic acids using a vector that
either integrates into the genome of said cell or organism,
or that permits expression of antisense RNA, and selecting
growth-impaired cells or organisms. Plasmids pGAL1PSiST-1
and pGAL1PNiST-1, used in said method.
Claim(s) not searched:
21-24 

Reason for the limitation of the search: 

Claims 21-24 refer to a compound identifiable with a method, without 
giving a true technical characterization of the compound. Moreover, no
such compounds are defined in the application. In consequence, the scope 
of said claims is ambiguous and vague, and their subject-matter is not 
sufficiently disclosed and supported (Art. 83 and 84 EPC). 
No search can be carried out for such purely speculative claims whose 
wording is, in fact, a mere recitation of the results to be achieved. 
Description

[0001] The present invention is concerned with the identification of genes or functional fragments thereof
from Candida albicans which are critical for growth and cell division and which genes may be used as drug
targets to treat Candida albicans associated infections. Novel nucleic acid sequences from Candida albicans are also
provided and which encode the polypeptides which are critical for growth of Candida albicans.

[0002] Opportunistic infections in immunocompromised hosts represent an increasingly common cause of mortality
and morbidity. Candida species are among the most commonly identified fungal pathogens associated with such
opportunistic infections, with Candida albicans being the most common species. Such fungal infections are
problematical in, for example, AIDS populations in addition to normal healthy women where Candida albicans yeasts
represent the most common cause of vulvovaginitis.

[0003] Although compounds do exist for treating such disorders, such as for example, amphotericin, these drugs 
are generally limited in their treatment because of their toxicity and side effects. Therefore, there exists a need 
for new compounds which may be used to treat Candida associated infections in addition to compounds which are 
selective in their action against Candida albicans.

[0004] Classical approaches for identifying anti-fungal compounds have relied almost exclusively on inhibition
of fungal or yeast growth as an endpoint. Libraries of natural products, semi-synthetic, or synthetic chemicals are
screened for their ability to kill or arrest growth of the target pathogen or a related nonpathogenic model organism.
These tests are cumbersome and provide no information about a compound's mechanism of action. The promising lead
compounds that emerge from such screens must then be tested for possible host toxicity, and detailed mechanism of
action studies must subsequently be conducted to identify the affected molecular target.

[0005] The present inventors have now identified a range of nucleic acid sequences from Candida albicans which
encode polypeptides which are critical for its survival and growth. These sequences represent novel targets which
can be incorporated into an assay to selectively identify compounds capable of inhibiting expression of such
polypeptides and their potential use in alleviating diseases or conditions associated with Candida albicans infection.
[0006] Therefore, according to a first aspect of the invention there is provided a nucleic acid molecule
encoding a polypeptide which is critical for survival and growth of the yeast Candida albicans and which nucleic acid
molecule comprises any of the sequences of nucleotides in Sequence ID Numbers 1 to 3, 5, 6, 8 to 11, 13, 15, 16, 18,
20, 21, 23, 25 to 29, 31, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 and 71.
[0007] A further aspect of the invention comprises a nucleic acid molecule encoding a polypeptide which is
critical for survival and growth of the yeast Candida albicans and which nucleic acid molecule comprises any of the
sequences of Sequence ID Numbers 1, 28, 35, 37 and 39 and fragments or derivatives of said nucleic acid molecule.
[0008] Also provided by the present invention is a nucleic acid molecule encoding a polypeptide which is
critical for survival and growth of the yeast Candida albicans and which polypeptide has an amino acid sequence
according to the sequence of any of Sequence ID Numbers 4, 7, 12, 14, 17, 19, 22, 24, 30, 32 to 34, 36, 38, 40, 42,
44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 and 72.
[0009] Letters utilised in the nucleic acid sequences according to the invention which are not recognisable as
letters of the genetic code signify a position in the nucleic acid sequence where one or more of bases A, G, C or T
can occupy the nucleotide position. Representative letters used to identify the range of bases which can be used are
as follows:

M: A or C
R: A or G
W: A or T
S: C or G
Y: C or T
K: G or T
V: A or C or G
H: A or C or T
D: A or G or T
B: C or G or T
N: G or A or T or C
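As an illustration of how these ambiguity letters can be read, the following minimal Python sketch expands a sequence containing such letters into every unambiguous sequence it stands for. The function name `expand` and the dictionary name are hypothetical and are not part of the application; the mapping itself is taken from the list of letters given in paragraph [0009].

```python
from itertools import product

# Ambiguity letters and the bases they may stand for, as listed in [0009].
# The unambiguous bases map to themselves.
AMBIGUITY_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def expand(sequence):
    """Return every unambiguous sequence matched by `sequence` (hypothetical helper)."""
    choices = (AMBIGUITY_CODES[base] for base in sequence.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # "R" is A or G and "Y" is C or T, so "ARY" stands for four sequences.
    print(expand("ARY"))
```

The number of expanded sequences grows multiplicatively with each ambiguous position, so this approach is only practical for short oligonucleotides.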



[0010] In one embodiment of the above identified aspects of the invention the nucleic acid may comprise a mRNA
molecule or alternatively a DNA and preferably a cDNA molecule.

[0011] Also provided by the present invention is a nucleic acid molecule capable of hybridising to the nucleic
acid molecules according to the invention under high stringency conditions.

[0012] Stringency of hybridisation as used herein refers to conditions under which polynucleic acids are stable.
The stability of hybrids is reflected in the melting temperature (Tm) of the hybrids. Tm can be approximated by the
formula:

$$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6\,\log_{10}[\mathrm{Na}^{+}] + 0.41\,(\%\,\mathrm{G\&C}) - \frac{600}{l}$$

wherein l is the length of the hybrids in nucleotides. Tm decreases approximately by 1-1.5°C with every 1% decrease
in sequence homology.
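The approximation in paragraph [0012] can be evaluated numerically. The sketch below is illustrative only: the function and parameter names are hypothetical, the sodium concentration is assumed to be given in mol/l, and the default correction of 1°C per 1% loss of homology is simply the lower end of the 1-1.5°C range stated in the text.

```python
import math


def approx_tm(na_molar, gc_percent, length_nt):
    """Approximate Tm in degrees C: 81.5 + 16.6*log10[Na+] + 0.41*(%G&C) - 600/l."""
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 600.0 / length_nt


def tm_with_divergence(tm, percent_divergence, degrees_per_percent=1.0):
    """Lower Tm by an assumed 1-1.5 degrees C per 1% decrease in sequence homology."""
    return tm - degrees_per_percent * percent_divergence


if __name__ == "__main__":
    # Hypothetical example values: 1 M Na+, 50% G&C, a 600-nucleotide hybrid.
    tm = approx_tm(1.0, 50.0, 600)
    print(round(tm, 1), round(tm_with_divergence(tm, 5.0), 1))
```

The example inputs are arbitrary and are not taken from the application; they only show how the three terms of the formula combine.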

[0013] The nucleic acid capable of hybridising to nucleic acid molecules according to the invention will
generally be at least 70%, preferably at least 80 or 90% and more preferably at least 95% homologous to the
nucleotide sequences according to the invention.

[0014] The DNA molecules according to the invention may, advantageously, be included in a suitable expression
vector to express polypeptides encoded therefrom in a suitable host.

[0015] The present invention also comprises within its scope proteins or polypeptides encoded by the nucleic 
acid molecules according to the invention or a functional equivalent, derivative or bioprecursor thereof. 

[0016] Therefore according to a further aspect of the invention there is provided a polypeptide having an amino
acid sequence of any of Sequence ID Numbers 4, 7, 12, 14, 17, 19, 22, 24, 30, 32 to 34, 36, 38, 40, 42, 44, 46, 48,
50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 and 72. A polypeptide encoded by the nucleic acid molecule according to
the invention is also provided, which polypeptide preferably comprises an amino acid sequence having the sequence
of any of Sequence ID Numbers 4, 7, 12, 14, 17, 19, 22, 24, 30, 32 to 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56,
58, 60, 62, 64, 66, 68, 70 and 72.

[0017] An expression vector according to the invention includes a vector having a nucleic acid according to the
invention operably linked to regulatory sequences, such as promoter regions, that are capable of effecting
expression of said DNA fragments. The term "operably linked" refers to a juxtaposition wherein the components
described are in a relationship permitting them to function in their intended manner. Such vectors may be
transformed into a suitable host cell to provide for expression of a polypeptide according to the invention. Thus,
in a further aspect, the invention provides a process for preparing polypeptides according to the invention which
comprises cultivating a host cell, transformed or transfected with an expression vector as described above, under
conditions to provide for expression by the vector of a coding sequence encoding the polypeptides, and recovering
the expressed polypeptides.

[0018] The vectors may be, for example, plasmid, virus or phage vectors provided with an origin of replication,
optionally a promoter for the expression of said nucleotide and optionally a regulator of the promoter. The vectors
may contain one or more selectable markers, such as, for example, ampicillin resistance.

[0019] Polynucleotides according to the invention may be inserted into the vectors described in an antisense
orientation in order to provide for the production of antisense RNA. Antisense RNA or other antisense nucleic acids
may be produced by synthetic means.

[0020] In accordance with the present invention, a defined nucleic acid includes not only the identical nucleic
acid but also any minor base variations including in particular, substitutions in bases which result in a synonymous
codon (a different codon specifying the same amino acid residue) due to the degenerate code in conservative amino
acid substitutions. The term "nucleic acid sequence" also includes the complementary sequence to any single stranded
sequence given regarding base variations.

[0021] The present invention also advantageously provides nucleic acid sequences of at least approximately 10
contiguous nucleotides of a nucleic acid according to the invention and preferably from 10 to 50 nucleotides. These
sequences may, advantageously, be used as probes or primers to initiate replication, or the like. Such nucleic acid
sequences may be produced according to techniques well known in the art, such as by recombinant or synthetic
means. They may also be used in diagnostic kits or the like for detecting the presence of a nucleic acid according
to the invention. These tests generally comprise contacting the probe with the sample under hybridising conditions
and detecting for the presence of any duplex or triplex formation between the probe and any nucleic acid in the
sample.

[0022] According to the present invention these probes may be anchored to a solid support. Preferably, they are
present on an array so that multiple probes can simultaneously hybridize to a single biological sample. The probes
can be spotted onto the array or synthesised in situ on the array. (See Lockhart et al., Nature Biotechnology, vol. 14,
December 1996, "Expression monitoring by hybridisation to high density oligonucleotide arrays".) A single array can
contain more than 100, 500 or even 1,000 different probes in discrete locations.

[0023] Advantageously, the nucleic acid sequences according to the invention may be produced using
such recombinant or synthetic means, such as for example using PCR cloning mechanisms which generally involve
making a pair of primers, which may be from approximately 10 to 50 nucleotides, to a region of the gene which is
desired to be cloned, bringing the primers into contact with mRNA, cDNA, or genomic DNA from a human cell,
performing a polymerase chain reaction under conditions which bring about amplification of the desired region,
isolating the amplified region or fragment and recovering the amplified DNA. Generally, such techniques as defined
herein are well known in the art, such as described in Sambrook et al. (Molecular Cloning: a Laboratory Manual, 1989).
[0024] The nucleic acids or oligonucleotides according to the invention may carry a revealing label. Suitable
labels include radioisotopes such as 32P or 35S, enzyme labels or other protein labels such as biotin or fluorescent
markers. Such labels may be added to the nucleic acids or oligonucleotides of the invention and may be detected
using known techniques per se.

[0025] The polypeptide or protein according to the invention includes all possible amino acid variants encoded
by the nucleic acid molecule according to the invention including a polypeptide encoded by said molecule and having
conservative amino acid changes. Polypeptides according to the invention further include variants of such sequences,
including naturally occurring allelic variants which are substantially homologous to said polypeptides. In this
context, substantial homology is regarded as a sequence which has at least 70%, preferably 80 or 90% amino acid
homology with the polypeptides encoded by the nucleic acid molecules according to the invention.
[0026] A nucleic acid which is particularly advantageous is one comprising the sequences of nucleotides
illustrated in Figures 1 which is specific to Candida albicans with no functionally related sequences in other
prokaryotic or eukaryotic organisms as yet identified from the respective genomic databases.
[0027] Nucleotide sequences according to the invention are particularly advantageous for selective therapeutic
targets for treating Candida albicans associated infections. For example, an antisense nucleic acid capable of binding
to the nucleic acid sequences according to the invention may be used to selectively inhibit expression of the
corresponding polypeptides, leading to impaired growth of the Candida albicans with reductions of associated
illnesses or diseases.

[0028] The nucleic acid molecule or the polypeptide according to the invention may be used as a medicament, or
in the preparation of a medicament, for treating diseases or conditions associated with Candida albicans infection.
[0029] Advantageously, the nucleic acid molecule or the polypeptide according to the invention may be provided 
in a pharmaceutical composition together with a pharmaceutically acceptable carrier, diluent or excipient therefor. 
[0030] Antibodies to the protein or polypeptide of the present invention may, advantageously, be prepared by
techniques which are known in the art. For example, polyclonal antibodies may be prepared by inoculating a host
animal such as a mouse with the polypeptide according to the invention or an epitope thereof and recovering immune
serum. Monoclonal antibodies may be prepared according to known techniques such as described by Kohler G. and
Milstein C., Nature (1975) 256, 495-497.
[0031] Antibodies according to the invention may also be used in a method of detecting for the presence of a
polypeptide according to the invention, which method comprises reacting the antibody with a sample and identifying
any protein bound to said antibody. A kit may also be provided for performing said method which comprises an
antibody according to the invention and means for reacting the antibody with said sample.

[0032] Proteins which interact with the polypeptide of the invention may be identified by investigating protein-
protein interactions using the two-hybrid vector system first proposed by Chien et al. (1991).

[0033] This technique is based on functional reconstitution in vivo of a transcription factor which activates a
reporter gene. More particularly the technique comprises providing an appropriate host cell with a DNA construct
comprising a reporter gene under the control of a promoter regulated by a transcription factor having a DNA binding
domain and an activating domain, expressing in the host cell a first hybrid DNA sequence encoding a first fusion of
a fragment or all of a nucleic acid sequence according to the invention and either said DNA binding domain or said
activating domain of the transcription factor, expressing in the host at least one second hybrid DNA sequence, such
as a library or the like, encoding putative binding proteins to be investigated together with the DNA binding or
activating domain of the transcription factor which is not incorporated in the first fusion; detecting any binding of
the proteins to be investigated with a protein according to the invention by detecting for the presence of any
reporter gene product in the host cell; optionally isolating second hybrid DNA sequences encoding the binding protein.
[0034] An example of such a technique utilises the GAL4 protein in yeast. GAL4 is a transcriptional activator of 
galactose metabolism in yeast and has a separate domain for binding to activators upstream of the galactose 
metabolising genes as well as a protein binding domain. Nucleotide vectors may be constructed, one of which 
comprises the nucleotide residues encoding the DNA binding domain of GAL4. These binding domain residues may be 
fused to a known protein encoding sequence, such as for example the nucleic acids according to the invention. The 
other vector comprises the residues encoding the protein binding domain of GAL4. These residues are fused to 
residues encoding a test protein. Any interaction between polypeptides encoded by the nucleic acid according to the 
invention and the protein to be tested leads to transcriptional activation of a reporter molecule in a GAL4
transcription deficient yeast cell into which the vectors have been transformed. Preferably, a reporter molecule
such as β-galactosidase is activated upon restoration of transcription of the yeast galactose metabolism genes.
[0035] Further provided by the present invention is one or more Candida albicans cells comprising an induced
mutation in the DNA sequence encoding the polypeptide according to the invention.

[0036] A further aspect of the invention provides a method of identifying compounds which selectively inhibit or
interfere with the expression, or the functionality, of polypeptides expressed from the nucleotide sequences
according to the invention or the metabolic pathways in which these polypeptides are involved and which are critical
for growth and survival of Candida albicans, which method comprises (a) contacting a compound to be tested with one
or more Candida albicans cells having a mutation in a nucleic acid molecule according to the invention, which
mutation results in overexpression or underexpression of said polypeptides, in addition to one or more wild type Candida
cells, (b) monitoring the growth and/or activity of said mutated cell compared to said wild type, wherein differential
growth or activity of said one or more mutated Candida cells provides an indication of selective action of said
compound on said polypeptide or another polypeptide in the same or a parallel pathway.
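The comparison in step (b) of paragraph [0036] can be pictured as follows. This is a minimal sketch under stated assumptions, not the scoring used in the application: growth is assumed to be summarised as a single positive number per culture (for example a final optical density), and the function names and the threshold `min_gap` are hypothetical.

```python
def relative_growth(treated, untreated):
    """Growth with compound as a fraction of growth without it (hypothetical measure)."""
    if untreated <= 0:
        raise ValueError("untreated growth must be positive")
    return treated / untreated


def shows_differential_growth(mutant_treated, mutant_untreated,
                              wild_type_treated, wild_type_untreated,
                              min_gap=0.2):
    """Flag a compound whose effect on the mutant differs from its effect on wild type.

    `min_gap` is an assumed, arbitrary cut-off on the difference between the
    two relative growth values; a real screen would need a calibrated statistic.
    """
    mutant = relative_growth(mutant_treated, mutant_untreated)
    wild_type = relative_growth(wild_type_treated, wild_type_untreated)
    return abs(mutant - wild_type) >= min_gap


if __name__ == "__main__":
    # Hypothetical readings: the mutant is strongly inhibited, the wild type barely.
    print(shows_differential_growth(0.3, 1.0, 0.9, 1.0))
```

Taking the absolute difference reflects the text's wording that either overexpressing or underexpressing mutants may be used, so a compound may affect the mutant more or less strongly than the wild type.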

[0037] Compounds identifiable or identified using the method according to the invention, may advantageously be 
used as a medicament, or in the preparation of a medicament to treat diseases or conditions associated with Candida 
albicans infection. These compounds may also advantageously be included in a pharmaceutical composition together 
with a pharmaceutically acceptable carrier, diluent or excipient therefor. 

[0038] A further aspect of the invention provides a method of identifying DNA sequences from a cell or organism
which DNA encodes polypeptides which are critical for growth or survival, which method comprises (a) preparing a
cDNA or genomic library from said cell or organism in a suitable expression vector which vector is such that it can
either integrate into the genome in said cell or that it permits transcription of antisense RNA from the nucleotide
sequences in said cDNA or genomic library, (b) selecting transformants exhibiting impaired growth and determining
the nucleotide sequence of the cDNA or genomic sequence from the library included in the vector from said
transformant. Preferably, the cell or organism may be any yeast or filamentous fungi, such as, for
example, Saccharomyces cerevisiae, Schizosaccharomyces pombe or Candida albicans.

[0039] A further aspect of the invention provides a pharmaceutical composition comprising a compound according
to the invention together with a pharmaceutically acceptable carrier, diluent or excipient therefor.
[0040] A further aspect of the invention comprises nucleic acid molecules encoding proteins which are critical
for survival and growth of Candida albicans, which nucleic acid molecules comprise any of the sequences illustrated
in Figures 5 to 28. Polypeptides which are critical for survival and growth of Candida albicans are also encompassed
within the present invention, and which polypeptides comprise any of the amino acid sequences illustrated in Figures
29 to 39.

[0041] The present invention may be more clearly understood with reference to the accompanying example, which
is purely exemplary, with reference to the accompanying drawings wherein:

Figure 1: is a diagrammatic representation of plasmid pGAL1PNiST-1.

Figure 2: is a nucleotide sequence of plasmid pGAL1PNiST-1 of Figure 1.

Figure 3: is a diagrammatic representation of plasmid pGAL1PSiST-1.

Figure 4: is a nucleotide sequence of plasmid pGAL1PSiST-1 of Figure 3.

Figures 5 to 28: illustrate the nucleotide sequences of oligonucleotides encoding polypeptides of previously
unknown function isolated from Candida albicans which are critical for its survival and
growth, according to the invention.

Figures 29 to 39: illustrate the amino acid sequences of polypeptides from Candida albicans which are
critical for its survival and growth, according to the invention.



Example 1 

Identification of novel drug targets in C. albicans by anti-sense and disruptive integration

[0042] The principle of the approach is based on the fact that when a particular C. albicans mRNA is inhibited by
producing the complementary anti-sense RNA, the corresponding protein will decrease. If this protein is critical for
growth or survival, the cell producing the anti-sense RNA will grow more slowly or will die.

[0043] Since anti-sense inhibition occurs at mRNA level, the gene copy number is irrelevant, thus allowing
applications of the strategy even in diploid organisms.

[0044] Anti-sense RNA is endogenously produced from an integrative or episomal plasmid with an inducible
promoter; induction of the promoter leads to the production of a RNA encoded by the insert of the plasmid. The
insert will differ from one plasmid to another in the library. The inserts will be derived from genomic DNA
fragments or from cDNA to cover, to the extent possible, the entire genome.

[0045] The vector is a proprietary vector allowing integration by homologous recombination at either the
homologous insert or promoter sequence in the Candida genome. After introducing plasmids from cDNA or genomic
libraries into C. albicans, transformants are screened for impaired growth after promoter (and thus anti-sense) induction
in the presence of lithium acetate. Lithium acetate prolongs the G1 phase and thus allows the anti-sense RNA to act for a
prolonged period of time during the cell cycle. Transformants which show impaired growth in both induced and non-
induced media, thus showing a growth defect due to integrative disruption, are selected as well.
[0046] Transformants showing impaired growth are supposed to contain plasmids which produce anti-sense RNA
to mRNAs critical for growth or survival. Growth is monitored by measuring growth curves over a period of time in a
device (Bioscreen Analyzer, Labsystems) which allows simultaneous measurement of growth curves of 200
samples.

[0047] Subsequently plasmids can be recovered from the transformants and the sequence of their inserts
determined, thus revealing which mRNA they inhibit. In order to be able to recover the genomic or cDNA insert which
has integrated into the Candida genome, genomic DNA is isolated, cut with an enzyme which cuts only once into the
library vector (estimated approx. every 4096 bp in the genome) and religated. PCR with primers flanking the
insert will yield (partial) genomic or cDNA inserts as PCR fragments which can directly be sequenced. This PCR
analysis (on ligation reaction) will also show us how many integrations occurred. Alternatively, the ligation
reaction is transformed to E. coli and PCR analysis is performed on colonies or on plasmid DNA derived thereof.
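The estimate of one cut roughly every 4096 bp matches the usual back-of-the-envelope calculation for an enzyme with a 6-bp recognition site. The sketch below states that calculation; it assumes the four bases occur independently and with equal frequency, which real genomes only approximate, and the function name is hypothetical.

```python
def expected_site_spacing(site_length, alphabet_size=4):
    """Expected distance in bp between occurrences of one specific site.

    Assumes independent, equally frequent bases (an idealisation), so a
    given site of length n occurs on average once every alphabet_size**n bp.
    """
    return alphabet_size ** site_length


if __name__ == "__main__":
    # A six-base recognition site under these assumptions.
    print(expected_site_spacing(6))
```

Under the same assumptions, a four-base site would be expected about once every 256 bp, which is why frequent cutters are used when small fragments are wanted.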
[0048] This method is employed for a genome wide search for novel C. albicans genes which are important for
growth or survival.

Materials & Methods
Construction of pGAL1PNiST-1

[0049] The backbone of the pGAL1PNiST-1 vector (integrative anti-sense SfiI-NotI vector) is pGEM11Zf(+)
(Promega Inc.). First, the CaMAL2 EcoRI/SalI promoter fragment from [plasmid name illegible] was ligated
into EcoRI/SalI-opened pGEM11Zf(+), resulting in the intermediate construct pGEMMAL2P-1. [Illegible]
the CaURA3 selection marker was cloned as an Eco47III/XmnI fragment derived from [plasmid name illegible]. The resulting
pGEMMAL2P-2 vector was NotI/HindIII opened in order to accept the NotI-stuffer-SfiI cassette from pPCKINiSCYCT-1
(EagI/HindIII fragment): pMAL2PNiST-1. Finally, the plasmid pGAL1PNiST-1 was constructed by exchanging
the SalI/Ecl136II promoter in pMAL2PNiST-1 by the XhoI/SmaI GAL1 promoter fragment derived from
pRM2GAL1P.

Construction of pGAL1PSiST-1
[0050] The vector pGAL1PSiST-1 was created for cloning the small genomic DNA fragments (flanked by SfiI
sites) behind the GAL1 promoter. The only difference with pGAL1PNiST-1 is that the hIFNβ (stuffer fragment) insert
fragment in pGAL1PSiST-1 is flanked by two SfiI sites instead of a SfiI and a NotI site as in pGAL1PNiST-1. To
construct pGAL1PSiST-1, the EcoRI-HindIII fragment, containing hIFNβ flanked by a SfiI and a NotI site, of
pMAL2pHiET-3 (unpublished) was exchanged by the EcoRI-HindIII fragment, containing hIFNβ flanked by two SfiI
sites, from YCp50S-S (an E. coli/S. cerevisiae shuttle vector derived from the plasmid YCp50, which is deposited in
the ATCC collection (number 37419; Thrash et al., 1985); an EcoRI-HindIII fragment, containing the gene hIFNβ, which
is flanked by two SfiI sites, was inserted in YCp50, creating YCp50S-S), resulting in plasmid pMAL2PSiST-1.
The MAL2 promoter from pMAL2PSiST-1 (by a NaeI-FspI digest) was further replaced by the GAL1 promoter from
pGAL1PNiST-1 (via a XhoI-SalI digest), creating the vector pGAL1PSiST-1.

Candida albicans genomic library 

* Preparation of the genomic DNA fragments 

[0051] A Candida albicans genomic DNA library with small DNA fragments (400 to 1,000 bp) was prepared.
Genomic DNA of Candida albicans B2630 was isolated following a modified protocol of Blin and Stafford (1976). The
quality of the isolated genomic DNA was checked by gel electrophoresis. Undigested DNA was located on the gel
above the marker band of 26,282 bp. A little smear, caused by fragmentation of the DNA, was present. To obtain
enrichment for genomic DNA fragments of the desired size, the genomic DNA was partially digested. Several
restriction enzymes (AluI, HaeIII and RsaI, all creating blunt ends) were tried out. The appropriate digest
conditions have been determined by titration of the enzyme. Enrichment of small DNA fragments was obtained with 70
units of AluI on 10 µg of genomic DNA for 20 min. T4 DNA polymerase (Boehringer) and dNTPs (Boehringer) were
added to polish the DNA ends. After extraction with phenol-chloroform the digest was size-fractionated on an agarose
gel. The genomic DNA fragments with a length of 500 to 1,250 bp were eluted from the gel by centrifugal filtration
(Zhu et al., 1985). SfiI adaptors (5' GTTGGCCTTTT) or (5' AGGCCAAC) were attached to the DNA ends (blunt) to
facilitate cloning of the fragments into the vector. Therefore, an 8-mer and 11-mer oligonucleotide (comprising the SfiI
site) were kinased and annealed. After ligation of these adaptors to the DNA fragments a second size-fractionation
was performed on an agarose gel. The DNA fragments of 400 to 1,150 bp were eluted from the gel by centrifugal
filtration.
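As a quick consistency check on the two adaptor oligonucleotides quoted above, the following sketch tests whether the 8-mer is complementary to one end of the 11-mer and reports the unpaired tail left after annealing. The helper names are hypothetical; only the two sequences are taken from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def single_stranded_tail(long_oligo, short_oligo):
    """Return the unpaired 3' tail of `long_oligo` when `short_oligo` anneals
    to its 5' end, or None if the two are not complementary there."""
    paired = reverse_complement(short_oligo)
    if long_oligo.upper().startswith(paired):
        return long_oligo.upper()[len(paired):]
    return None


if __name__ == "__main__":
    eleven_mer = "GTTGGCCTTTT"  # 5' to 3', as given in the text
    eight_mer = "AGGCCAAC"      # 5' to 3', as given in the text
    print(reverse_complement(eight_mer), single_stranded_tail(eleven_mer, eight_mer))
```

With these sequences the 8-mer pairs with the first eight bases of the 11-mer, giving one blunt end for ligation to the polished genomic fragments and a three-base single-stranded tail at the other end.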

* Preparation of the pGAL1PSiST-1 vector fragment

[0052] The small genomic DNA fragments were cloned after the GAL1 promoter in the vector pGAL1PSiST-1.
Qiagen-purified pGAL1PSiST-1 plasmid DNA was digested with SfiI and the largest vector fragment eluted from the
gel by centrifugal filtration (Zhu et al., 1985). Ligation with a control DNA fragment, flanked by SfiI sites, was performed
as a control. The ligation mix was electroporated to MC1061 E. coli cells. Plasmid DNA of 24 clones was analyzed. In
all cases the control fragment was inserted in the pGAL1PSiST-1 vector fragment.
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[0053] All genomic DNA fragments (450 ng) were ligated into the pGAL1PSiST-1 vector (20 ng). After 
electroporation at 2500V, AOpF circa 400.000 clones were obtained. These clones were pooled into three groups and 
stored as glycerol slants. Also Qiagen-purified DNA was prepared from these clones. A clone analysis showed an 
average insert length of 600 bp and a percentage of 91 for clones with an insert. The size of the library 
corresponds to 5 times the diploid genome. The genomic DNA inserts are sense or anti-sense onentated in the vector. 
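
The library-size figures above can be tied together with a little arithmetic. The following is a minimal sketch, not part of the described method: the function names are illustrative, and the genome size is not stated in this section, so the sketch only derives the diploid genome size that the stated 5-fold coverage would imply.

```python
def library_insert_bp(n_clones: int, insert_fraction: float, mean_insert_bp: float) -> float:
    """Total cloned DNA in the library: clones that carry an insert times mean insert length."""
    return n_clones * insert_fraction * mean_insert_bp


def implied_genome_bp(total_insert_bp: float, fold_coverage: float) -> float:
    """Genome size implied by a stated fold coverage (an inference, not a measured value)."""
    if fold_coverage <= 0:
        raise ValueError("fold_coverage must be positive")
    return total_insert_bp / fold_coverage


# Figures quoted in the text: circa 400,000 clones, 91% with insert, 600 bp average insert.
total_bp = library_insert_bp(400_000, 0.91, 600)
genome_bp = implied_genome_bp(total_bp, 5)
print(f"cloned DNA: {total_bp / 1e6:.1f} Mb; implied diploid genome: {genome_bp / 1e6:.1f} Mb")
```

Because the clone count is approximate ("circa"), the result should be read as an order-of-magnitude check only.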

Candida albicans cDNA library 

[0054] Total RNA was extracted from Candida albicans B2630 grown on minimal (SD) and rich (YPD) 
medium, respectively, as described by Chirgwin et al. in Sambrook et al. mRNA was prepared from total RNA using the Invitrogen Fast 
Track procedure. 

[0055] First strand cDNA is synthesised with the Superscript Reverse Transcriptase (BRL) and with an oligo 
dT-NotI primer adapter. After second strand synthesis, cDNA is polished with Klenow enzyme and purified over a 
Sephacryl S-400 spun column. Phosphorylated SfiI adapters are then ligated to the cDNA, followed by digestion with 
the NotI restriction enzyme. The SfiI/NotI cDNA is then purified and sized on a Biogel A150M column. 
[0056] The first fraction yielded approximately 38,720 clones by transformation, the second fraction only 1,540 
clones. Clone analysis: 

Fr. I: 22/24 inserts, 16 ≥ 1000 bp, 4 ≥ 2000 bp, average size: 1500 bp. 

Fr. II: 9/12 inserts, 3 ≥ 1000 bp, average size: 960 bp. 

cDNA was ligated in a NotI/SfiI opened pGAL1PNiST-1 vector (anti-sense). 

Candida transformation 

[0057] The host strain used for transformation is a C. albicans ura3 mutant, CAI-4, which contains a deletion in 
orotidine-5'-phosphate decarboxylase and was obtained from William Fonzi, Georgetown University (Fonzi and Irwin). 
CAI-4 was transformed with the above described cDNA library or genomic library using the Pichia spheroplast module 
(Invitrogen). Resulting transformants were plated on minimal medium supplemented with glucose (SD, 0.67% or 1.34% 
Yeast Nitrogen Base w/o amino acids + 2% glucose) plates and incubated for 2-3 days at 30°C. 

Screening for mutants 

[0058] Starter cultures were set up by inoculating each colony in 1 ml SD medium and incubating overnight at 
30°C and 300 rpm. Cell densities were determined using a Coulter counter (Coulter Z1; Coulter Electronics). 
250,000 cells/ml were inoculated in 1 ml SD medium and cultures were incubated for 24 hours at 30°C and 300 rpm. 
Cultures were washed in minimal medium without glucose (S) and the pellet was resuspended in 650 µl S medium. 8 µl of 
this culture was used to inoculate 400 µl cultures in a Honeycomb 100 plate (Bioscreen analyzer; Labsystems). Each 
transformant was grown for three days in S medium containing LiAc, pH 6.0, with 2% glucose/2% maltose or 2% 
galactose/2% maltose respectively, while shaking every 3 minutes for 20 seconds. Optical densities were measured 
every hour during three consecutive days and growth curves were generated (Bioscreen analyzer; Labsystems). 
[0059] Growth curves of transformants grown in anti-sense non-inducing (glucose/maltose) and 
inducing (galactose/maltose) medium, respectively, are compared, and those transformants showing impaired growth upon anti-sense 
induction are selected for further analysis. Transformants showing impaired growth by virtue of integration into a 
critical gene are also selected. 
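
One way to make the growth-curve comparison above concrete is to reduce each curve to its area under the curve (AUC) and compare the induced culture with the non-induced one. This is only a sketch under stated assumptions: the text does not say how "impaired growth" was scored, and the AUC metric, the function names and the 0.7 cut-off are all hypothetical choices made for illustration.

```python
from typing import Sequence


def area_under_curve(times_h: Sequence[float], od: Sequence[float]) -> float:
    """Trapezoidal area under an optical-density growth curve."""
    if len(times_h) != len(od) or len(od) < 2:
        raise ValueError("need at least two paired time/OD readings")
    return sum(
        (t1 - t0) * (y0 + y1) / 2.0
        for t0, t1, y0, y1 in zip(times_h, times_h[1:], od, od[1:])
    )


def is_growth_impaired(
    times_h: Sequence[float],
    od_non_induced: Sequence[float],
    od_induced: Sequence[float],
    max_ratio: float = 0.7,  # hypothetical cut-off, not taken from the text
) -> bool:
    """Flag a transformant whose induced (galactose/maltose) curve falls well below
    its non-induced (glucose/maltose) curve."""
    reference = area_under_curve(times_h, od_non_induced)
    if reference <= 0:
        raise ValueError("non-induced culture shows no growth")
    return area_under_curve(times_h, od_induced) / reference < max_ratio


# Toy hourly readings (invented numbers, for illustration only).
hours = [0, 1, 2]
print(is_growth_impaired(hours, [0.0, 1.0, 2.0], [0.0, 0.5, 1.0]))
```

In practice the hourly readings over three days would be used, and the cut-off would have to be calibrated against control transformants.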


Isolation of genomic or cDNA inserts 

[0060] Putatively interesting transformants are grown in 1.5 ml SD overnight and genomic DNA is isolated using 
the Nucleon MiY Yeast kit (Clontech). The concentration of genomic DNA is estimated by analyzing a sample on an agarose 
gel. 

[0061] 20 ng of genomic DNA is digested for three hours with an enzyme that cuts uniquely in the library vector 
(SacI for the genomic library, PstI for the cDNA library) and treated with RNAse. Samples are phenol/chloroform 
extracted and precipitated using NaOAc/ethanol. 

[0062] The resulting pellet is resuspended in 500 µl ligation mixture (1 x ligation buffer and 4 units of T4 DNA 
ligase; both from Boehringer) and incubated overnight at 16°C. 

[0063] After denaturation (20 min at 65°C), purification (phenol/chloroform extraction) and precipitation 
(NaOAc/ethanol) the pellet is resuspended in 10 µl MilliQ (Millipore) water. 

PCR analysis 

[0064] Inverse PCR is performed on 1 µl of the precipitated ligation reaction using library vector specific 
primers (oligo23 5' TGC-AGC-TCG-ACC-TCG-ACT-G 3' and oligo25 5' GCG-TGA-ATG-TAA-GCG-TGA-C 3' for the 
genomic library; SpGALNistPCR primer 5' TGAGCAGCTCGCCGTCGCGC 3' and SpGALNistPCR primer 
5' GAGTTATACCCTGCAGCTCGAC 3' for the cDNA library; both from Eurogentec) for 30 cycles, each consisting of (a) 1 
min at 95°C, (b) 1 min at 57°C, and (c) 3 min at 72°C. In the reaction mixture 2.5 units of Taq Polymerase 
(Boehringer) with TaqStart antibody (Clontech) (1:1) were used, and the final concentrations were 0.2 µM of each 
primer, 3 mM MgCl2 (Perkin Elmer Cetus) and 200 µM dNTPs (Perkin Elmer Cetus). PCR was performed in a 
Robocycler (Stratagene). 
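
The cycling conditions above can be written down as a small data structure, which also gives the total programmed hold time. This is an illustrative sketch only: the step labels (denaturation, annealing, extension) are conventional names assumed here rather than taken from the text, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Step:
    name: str        # conventional label, assumed for readability
    temp_c: float    # block temperature in degrees Celsius
    minutes: float   # hold time


# Steps (a), (b) and (c) of the inverse PCR as quoted in the text.
INVERSE_PCR_CYCLE = (
    Step("denaturation", 95, 1),
    Step("annealing", 57, 1),
    Step("extension", 72, 3),
)
N_CYCLES = 30


def total_hold_minutes(cycle: Sequence[Step], n_cycles: int) -> float:
    """Sum of programmed hold times over all cycles (ramping not included)."""
    return n_cycles * sum(step.minutes for step in cycle)


print(total_hold_minutes(INVERSE_PCR_CYCLE, N_CYCLES), "min of programmed hold time")
```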

Sequence determination 

[0065] Resulting PCR products were purified using the PCR purification kit (Qiagen) and were quantified by 
comparison of band intensity on an EtBr-stained agarose gel with the intensity of DNA marker bands. The amount of 
PCR product (expressed in ng) used in the sequencing reaction is calculated as the length of the PCR product in 
base pairs divided by 10. Sequencing reactions were performed using the ABI Prism BigDye Terminator Cycle 
Sequencing Ready Reaction kit according to the instructions of the manufacturer (PE Applied Biosystems, Foster City, 
CA) except for the following modifications. 

[0066] The total reaction volume was reduced to 15 µl. Reaction volumes of individual reagents were changed 
accordingly. 6.0 µl Terminator Ready Reaction Mix was replaced by a mixture of 3.0 µl Terminator Ready Reaction Mix 
+ 3.0 µl Half Term (GENPAK Limited, Brighton, UK). After cycle sequencing, reaction mixtures were purified over 
Sephadex G50 columns prepared on Multiscreen HV opaque microtiter plates (Millipore, Molsheim, FR) and were dried 
in a SpeedVac. Reaction products were resuspended in 3 µl loading buffer. Following denaturation for 2 min at 95°C, 
1 µl of sample was applied on a 5% Long Ranger Gel (36 cm well-to-read) prepared from Singel Packs according to the 
supplier's instructions (FMC BioProducts, Rockland, ME). Samples were run for 7 hours (2X run) on an ABI 377XL DNA 
sequencer. Data collection version 2.0 and Sequence analysis version 3.0 (for basecalling) software packages are 
from PE Applied Biosystems. Resulting sequence text files were copied onto a server for further analysis. 
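
The template rule stated in paragraph [0065] (ng of PCR product = product length in base pairs divided by 10) is simple enough to express directly. A minimal sketch, with an illustrative function name:

```python
def sequencing_template_ng(product_length_bp: float) -> float:
    """Amount of PCR product (ng) for one sequencing reaction: length in bp divided by 10."""
    if product_length_bp <= 0:
        raise ValueError("product length must be positive")
    return product_length_bp / 10.0


# Example lengths chosen for illustration only.
for length_bp in (600, 1500):
    print(f"{length_bp} bp product: {sequencing_template_ng(length_bp):.0f} ng")
```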
[0067] Nucleotide sequences were imported in the VectorNTI software package (InforMax Inc., North Bethesda, 
MD, USA) and the vector and insert regions of the sequences were identified. Sequence similarity searches against 
public and commercial sequence databases were performed with the BLAST software package (Altschul et al., 1990) 
version 1.4. Both the original nucleotide sequence and the six-frame conceptual translations of the insert region 
were used as query sequences. The public databases used were the EMBL nucleotide sequence database (Stoesser et 
al., 1998), the SWISS-PROT protein sequence database and its supplement TrEMBL (Bairoch and Apweiler, 1998), and 
the ALCES Candida albicans sequence database (Stanford University, University of Minnesota). The commercial 
sequence databases used were the LifeSeq® human and PathoSeq™ microbial genomic databases (Incyte 
Pharmaceuticals Inc., Palo Alto, CA, USA), and the GENESEQ patent sequence database (Derwent, London, UK). 
Three major results were obtained on the basis of the sequence similarity searches: function, novelty, and 
specificity. A putative function was deduced on the basis of the similarity with sequences with a known function, 
the novelty was based on the absence or presence of the sequences in public databases, and the specificity was based 
on the similarity with vertebrate homologues. 

Methods 

[0068] Blastx of the nucleic acid sequences against the appropriate protein databases: Swiss-Prot for clones of 
which the complete sequence is present in the public domain, and paorfp (PathoSeq™) for clones of which the 
complete sequence is not present in the public domain. 

[0069] The protein to which the translated nucleic acid sequence corresponds is used as a starting point. The 
differences between this protein and our translated nucleic acid sequences are marked with a double line and 
annotated above the protein sequence. The following symbols are used: 

a one-letter amino acid code or the ambiguity code X is used if our translated nucleic acid sequence has another 
amino acid at a certain position, 

the stop codon sign * is used if our translated nucleic acid sequence has a stop codon at a certain position, 

the letters fs (frame shift) are used if a frame shift occurs in our translated nucleic acid sequence and 
another reading frame is used, 

the words ambiguity or ambiguities are used if a part of our translated nucleic acid sequence is present in the 
protein, but not visible in the alignments of the blast results, 

the phrase missing sequence is used if the translated nucleic acid sequence does not comprise that part of the 
protein. 

Blastx: compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) 
against a protein sequence database. 
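
The six-frame conceptual translation that Blastx relies on can be sketched in a few lines. This is not the BLAST implementation; it is a minimal illustration with hypothetical function names. It uses the standard genetic code by default; because C. albicans decodes the CUG codon as serine rather than leucine, an optional `cug_as_ser` switch is included. Codons containing ambiguity symbols are rendered as X, matching the ambiguity code described above.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (non-ACGT symbols are left unchanged)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str, cug_as_ser: bool = False) -> str:
    """Translate codon by codon from position 0; unknown codons become X."""
    table = dict(STANDARD_CODE)
    if cug_as_ser:
        table["CTG"] = "S"  # CUG read as serine, as in C. albicans
    seq = seq.upper()
    return "".join(table.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq: str, cug_as_ser: bool = False) -> dict:
    """Translations of the three forward and three reverse-strand reading frames."""
    frames = {}
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:], cug_as_ser)
    return frames


# Codons corresponding to the first eight residues of SEQ ID NO: 4 in the listing.
print(translate("TCCTGTGAAGACGAACATCACAAC"))
```

A real search would pass each of the six translated frames to the protein database comparison; here they are simply returned as strings keyed by frame.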

Screening for compounds modulating expression of polypeptides critical for growth and survival of C. albicans 

[0070] The method proposed is based on observations (Sandbaken et al., 1990; Hinnebusch and Liebman, 1991; 
Ribogene PCT WO 95/11969, 1995) suggesting that underexpression or overexpression of any component of a process 
(e.g. translation) could lead to altered sensitivity to an inhibitor of a relevant step in that process. Such an 
inhibitor should be more potent against a cell limited by a deficiency in the macromolecule catalyzing that step 
and/or less potent against a cell containing an excess of that macromolecule, as compared to the wild type (WT) cell. 

[0071] Mutant yeast strains, for example, have shown that some steps of translation are sensitive to the 
stoichiometry of the macromolecules involved (Sandbaken et al.). Such strains are more sensitive to compounds which 
specifically perturb translation (by acting on a component that participates in translation) but are equally 
sensitive to compounds with other mechanisms of action. 

[0072] This method thus provides not only a means to identify whether a test compound perturbs a certain process 
but also an indication of the site at which it exerts its effect. The component which is present in altered form or 
amount in a cell whose growth is affected by a test compound is potentially the site of action of the test compound. 
[0073] The assay to be set up involves measurement of growth of an isogenic strain which has been modified only 
in a certain specific allele, relative to a wild type (WT) C. albicans strain, in the presence of R-compounds. Strains can 
be ones in which the expression of a specific essential protein is impaired upon induction of anti-sense, or strains 
which carry disruptions in an essential gene. An in silico approach to finding novel essential genes in C. albicans will 
be performed. A number of essential genes identified in this way will be disrupted (in one allele) and the resulting 
strains can be used for comparative growth screening. 

Assay for High Throughput screening for drugs 

[0074] 35 µl minimal medium (S medium + 2% galactose + 2% maltose) is transferred into a transparent 
flat-bottomed 96 well plate using an automated pipetting system (Multidrop, Labsystems). A 96-channel pipettor (Hydra, 
Robbins Scientific) transfers 2.5 µl of R-compound at 10*^^ M in DMSO from a stock plate into the assay plate. 
[0075] The selected C. albicans strains (mutant and parent (CAI-4) strain) are stored as glycerol stocks (15%) 
at -70°C. The strains are streaked out on selective plates (SD medium) and incubated for two days at 30°C. For the 
parent strain, CAI-4, the medium is always supplemented with 20 µg/ml uridine. A single colony is scooped up and 
resuspended in 1 ml minimal medium (S medium + 2% galactose + 2% maltose). Cells are incubated at 30°C for 8 
hours while shaking at 250 rpm. A 10 ml culture is inoculated at 250,000 cells/ml. Cultures are incubated at 30°C for 24 
hours while shaking at 250 rpm. Cells are counted in a Coulter counter and the final culture (S medium + 2% galactose 
+ 2% maltose) is inoculated at 20,000 to 50,000 cells/ml. Cultures are grown at 30°C while shaking at 250 rpm until a 
final OD of 0.24 (+/- 0.04) 6nM is reached. 

[0076] 200 µl of this yeast suspension is added to all wells of MW96 plates containing R-compounds in a 450 µl 
total volume. MW96 plates are incubated (static) at 30°C for 48 hours. 
[0077] Optical densities are measured after 48 hours. 

[0078] Test growth is expressed as a percentage of positive control growth for both mutant (x) and wild type (y) 
strains. The ratio (x/y) of these derived variables is calculated. 
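
The read-out of paragraph [0078] can be sketched as follows. This is a minimal illustration, not the screening software used: the function names are hypothetical, the optional blank subtraction is an assumption not mentioned in the text, and "positive control growth" is taken here to mean the optical density of the same strain grown without test compound.

```python
def percent_of_control(test_od: float, control_od: float, blank_od: float = 0.0) -> float:
    """Growth in the presence of compound as a percentage of positive-control growth."""
    control = control_od - blank_od
    if control <= 0:
        raise ValueError("control well shows no growth above blank")
    return 100.0 * (test_od - blank_od) / control


def mutant_to_wild_type_ratio(
    mutant_test_od: float,
    mutant_control_od: float,
    wt_test_od: float,
    wt_control_od: float,
    blank_od: float = 0.0,
) -> float:
    """Ratio x/y, where x and y are percent-of-control growth of mutant and wild type."""
    x = percent_of_control(mutant_test_od, mutant_control_od, blank_od)
    y = percent_of_control(wt_test_od, wt_control_od, blank_od)
    if y == 0:
        raise ValueError("wild type shows no growth; ratio undefined")
    return x / y


# Invented optical densities: the compound halves wild-type growth
# but reduces mutant growth to a quarter of its control.
print(mutant_to_wild_type_ratio(0.25, 1.0, 0.5, 1.0))
```

A ratio well below 1 would point to a compound that hits the mutant harder than the wild type, which is the differential growth the assay is designed to detect.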
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: 
(A) NAME: Janssen Pharmaceutica 
(B) STREET: Turnhoutseweg 30 
(C) CITY: Beerse 
(E) COUNTRY: Belgium 
(F) POSTAL CODE (ZIP): B-2340 
(G) TELEPHONE: +32 (0)14/60.21.11 
(H) TELEFAX: +32 (0)14/60.28.41 

(ii) TITLE OF INVENTION: DRUG TARGETS IN CANDIDA ALBICANS 

(iii) NUMBER OF SEQUENCES: 72 

(iv) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 

(vi) PRIOR APPLICATION DATA: 
(A) APPLICATION NUMBER: GB 9817796.7 
(B) FILING DATE: 14-AUG-1998 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 255 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
AACGTTCGTG CAAAAGCCTA TACTGCTGAT ATCCACGCAG ATGAAGAGCA AGTTTAATCA 
ACTCTTTCTC AATTAATGCT GTACTTGrTT TCATTTTATT TGCTGGCATT TAAAGAATAC 
CCATAGTTCA GAAAATAAAA TTGAAAAATT TAAAAAAAAA CGCAATATCA TTCATTTTTT 
TTGTTTTTTT CACAATAATA TTAATATGTA GTTACCAATG TTTTTAGATT TTATATGTTT 
TGAAAAAATA GTTTG 
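
Each entry in the listing declares a LENGTH, which makes a simple consistency check possible: strip spacing and row counters from the printed residues, count what remains, and flag any symbol other than A, C, G or T. The sketch below is illustrative only (the function names are hypothetical); it is run here on SEQ ID NO: 1 exactly as printed above, including any character damage in the text. Flagged symbols may be legitimate ambiguity codes or transcription errors; the check cannot tell these apart.

```python
import re
from typing import List, Tuple

SEQ_ID_NO_1 = """
AACGTTCGTG CAAAAGCCTA TACTGCTGAT ATCCACGCAG ATGAAGAGCA AGTTTAATCA
ACTCTTTCTC AATTAATGCT GTACTTGrTT TCATTTTATT TGCTGGCATT TAAAGAATAC
CCATAGTTCA GAAAATAAAA TTGAAAAATT TAAAAAAAAA CGCAATATCA TTCATTTTTT
TTGTTTTTTT CACAATAATA TTAATATGTA GTTACCAATG TTTTTAGATT TTATATGTTT
TGAAAAAATA GTTTG
"""


def clean_residues(block: str) -> str:
    """Remove whitespace and digits (row position counters) and upper-case the rest."""
    return re.sub(r"[\s\d]", "", block).upper()


def check_sequence(block: str, declared_length: int) -> Tuple[bool, int, List[int]]:
    """Return (length matches, observed length, 1-based positions of non-ACGT symbols)."""
    seq = clean_residues(block)
    suspicious = [pos for pos, base in enumerate(seq, start=1) if base not in "ACGT"]
    return len(seq) == declared_length, len(seq), suspicious


length_ok, observed, suspicious_positions = check_sequence(SEQ_ID_NO_1, 255)
print(length_ok, observed, suspicious_positions)
```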

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 648 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
AACCTCTTAT TCGCTTCTAG TGTCTCAATT GGTTATCCAT TAACATCTAT TCCCAACTCC 
ATCATTATTG GCAATAAATA AATGGGTGTT ATATCTATTG GTAATAACTA AACTGGTGTC 
AATTCAATTC CAATATGCTC ATGACAATTG AAAGTGTTAC TGTTCTGGTT TACATATTCT 
ACACGTTACA ACTATTGATT GGTTAGAAGT TTGGTTTCAA CATCACCTGT TGCTAAGAAT 
AAATGTTCGT CATATCAATT GAATCATTTC TTGCTGITAT GGTAAGTAAA TGCTGOTTAT 
ATCTATTATC TACAACCACC AAGTGATAAA TGCTGAACCC TACTCACCAA CTCTTATCCT 
OCTTGTATCT ATTGACTAAA ACTACCCTAC GGATAAATGC TG^CGTGG TTACCAACTG 
TTATGCTCGT TGTATCTATT AACTGCAACC ACCAAATGAT AAATGCTGAA CCAIAATTAC 
CAACTGTTAC ATTCCTGGTA CTACATTAAC AATAAATGCT GCATCTACAA GTACCACCTG 
TTCTGTTAAT AAATGCTGCA CCTGCTAGTA CAACTGTTGC TGGTCATGAT AGTTACTACA 
CATTACACAC CAGACAGTGG CAAACAAGGT TATGTAGAAA CCAACGTT 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 904 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
AACTGTCCTG TGAAGACGAA CATCACAACC ACAATCATGG TCATAACCAA AATCACAATC 
ATGTTGCTCC TATTCCTACA ACAGCTGGAC AATCATTAAA TAATAAAATT GATACATCTA 
AAGTGACASC TCTCAACATG GCCAACTCTG CTGACGATCT AGCAAAAGTT TTCAAAGATT 
CGACTAAAAA ATATCAAATC AAACCAATTA TCAAATCAGA CAGTGATGAA CAAATGATTA 
TCAACATTCC ATTTCTTAAT GGTAGTGTCA AATTGTATTC GATAATTCTA CGTACCAATG 
GGGATTTGTA TTGTCCCAAA ACAATAAAAT TATTCAAAAA TGACACATCA ATTGATTTTG 
ATAATCTGGA TTCGAAGAAA CCAATACAGG TGTTAACTCA TCCTCAAGTT GGTGTTGCTA 
ATAATGATAG CGATGATCTT CCAGAGTTTT TGGAATCAAA TAACGATGAC GATTTTGTCG 
AACATTATGT GTCTCGACAT AAATTCACTG GOGTAAATCA ATTGACAATA TTTATTGAAG 
ATATTTATGA TGAACGAGAA GAAGAGTGTC ATTTACATTC AATTGAATTG AGAGGGGAAT 
TCACTGAATT AAACAAAGAC CCTGTCATTA CATTATATGA ACTGGCTGCT AATCCTGCTG 
ATCATAAGAA TTTAACGATT GTTGAAAATC AAAATCTAGC ATAAAACAAA GAAGTGAAAG 
GTATCAGATA AGCTGGTTAC ATTACAATTG ATCTAATTTA GAATCTCAAC GTATTTAAAT 
TTGCCGTTTT GCGATAATAT AACATGGTCA AGAACGTTGA ATCGATTACG TTAATGGTTT 
AGCTAATTGA TTTTTAGGAT CGAGTATTTA GAGTGAATAA ACAATAAACA AGAATGATGA 
ATTG 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 232 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Ser Cys Glu Asp Glu His His Asn His Asn His Gly His Asn Gln Asn 
1 5 10 15 

His Asn His Val Ala Pro Ile Pro Thr Thr Ala Gly Gln Ser Leu Asn 
20 25 30 

Asn Lys Ile Asp Thr Ser Lys Val Thr Ala Leu Asn Met Ala Asn Ser 
35 40 45 

Ala Asp Asp Leu Ala Lys Val Phe Lys Asp Ser Thr Lys Lys Tyr Gln 
50 55 60 

Ile Lys Pro Ile Ile Lys Ser Asp Ser Asp Glu Gln Met Ile Ile Asn 
65 70 75 80 

Ile Pro Phe Leu Asn Gly Ser Val Lys Leu Tyr Ser Ile Ile Leu Arg 
85 90 95 

Thr Asn Gly Asp Leu Tyr Cys Pro Lys Thr Ile Lys Leu Phe Lys Asn 
100 105 110 

Asp Thr Ser Ile Asp Phe Asp Asn Val Asp Ser Lys Lys Pro Ile Gln 
115 120 125 

Val Leu Thr His Pro Gln Val Gly Val Ala Asn Asn Asp Ser Asp Asp 
130 135 140 

Leu Pro Glu Phe Leu Glu Ser Asn Asn Asp Asp Asp Phe Val Glu His 
145 150 155 160 

Tyr Val Ser Arg His Lys Phe Thr Gly Val Asn Gln Leu Thr Ile Phe 
165 170 175 

Ile Glu Asp Ile Tyr Asp Glu Gly Glu Glu Glu Cys His Leu His Ser 
180 185 190 

Ile Glu Leu Arg Gly Glu Phe Thr Glu Leu Asn Lys Asp Pro Val Ile 
195 200 205 

Thr Leu Tyr Glu Ser Ala Ala Asn Pro Ala Asp His Lys Asn Leu Thr 
210 215 220 

Ile Val Glu Asn Gln Asn Leu Ala 
225 230 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 608 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

AACCTACAAA AGACTCACAT GTGCTCTACA ATAAATTTCT GGATAAGCAT ATAAGTGATG 60 

AGCAACTATC ACACTTACTC GACAATCATA AACCCAATCT AGTGACTACC ACAACTTTAA 120 

TTGATTCTAT CAAAGAAAGT GAACTGTTAT ATAATACCAT GGACAGTTTG ATGATAAAAT 180 

CCATCAATTT TCCTGCAGCC ATGTACCACT CAAATGACAA CAATTCACAA TCACCAATCG 240 

AGTATTTATC TAACAGACTA AAATTGCTCA CACAAGAGTT ATACGAAGAT TCACTCAAAT 300 

ATGGCAACTT TCTACAGAGT GGTAATAATC ATATATATCA ATTACGAACT AGGATTTTAC 360 

AGACCTTTGA TCAGTTCTCA GAGACTCACT AITCTTTAAA TGAACTATAT AATAAAGACA 420 

TGTCTTACGC AGAAACATTA CACGGATCTT TCAAGAAATG GGATCAACAA AGAAATAAAG 480 

TATTGTCCAA ACTGAAATCT ATAAAAAGTC ATACAAGCAA ACATGGAGCC AAATTATTCA 540 

CCTTATTAGA TGAAGTTAAT CATGTTGATG ACGAGATCAA ACTTTTGGAA CCAAAACTAC 600 

AGCAGGTT 608 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1497 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GATATCTGCA GAATTCGGCT TCTCTCTCAT CTTCACACAA TGCATTTTAC AAGTAGCCTA 60 

CTAGCCACCT TGATATCGTT TACATTACCG GTTCAAAGTT TGAATACTGA ATCTAGGACA 120 

ACTTCAAATA ACACAATATC AATACTTACA AACCATTTTC AAATACTAAA GGATTTGCTA 180 

CCATATAGCA AAACTTCTAA ACCGCAAATC AAGGAATCCA GACCGTTGAT TAAAGTTCTG 240 

AGAGATGGAG TGCCAATAAA TTTCCACAGG GCTCCGGCTA TAATAATGAA ATCGAACAAA 300 

ACAGACGATT TACTCAGGAA TAGCAATAAA ACAATGGTGC TAACTGAAAT AAAAACGATT 360 

ACTGAATTTC CAACTACCAC TCTTTCCCCT ACACAAGAAT TTCAAGCACT ACAGATAAAC 420 

CTTAACACGT TATCAATAGA GACTTCAACA CCAACATTCC AATCCCATGA CTTTCCACCG 480 

ATTACCATTO AAGACACACC CAAAACACTA GAACCAGAAO AATCGTCAGA TGCTTTGCAG 540 

AGGGATGCAT TTGATCAAAT TAAGAAACTA GAAAAATTGG TATTGGATTT GAGACTTGAA 600 

ATGAAAGAGC AACAAAAGAG TTTCAACGAT CAATTAGTGG ATATATATAC CGCAAGAAGT 660 

ATTGTTCCAA TTTATACTAC ACATATCGTC ACTTCGGCGA TTCCATCGTA TGTACCAAAA 720 

GAAGAAGTAA TGGTTTCACA TGATACTGCA CCAATTGTAA GTCGTCCTAG AACAGATATT 780 






EP 0 982 401 A2 

CCAGTATCTC AACGAATTGA TACTATCTCA AAACATAAAA TGAATGGAAA AAATATATTG 840 

AACAACAATC CTCCGCCCAA TTCAGTTTTA ATACTTCCTC AGTTTCACTT CCATGAAAGA 900 

ATGGCCACCA AAACCGAAGT AGCTTATATG AAACCAAAAA TTGTCTGGAC CAACTTTCCA 960 

ACCACTACTG CAACGTCAAT CTTTGACAAT TTTATTTTAA AAAATCTTCT TGACGAAACG 1020 

GRTTCTGAAA TTGATAGTGC TGAAACTGAA TTCTCTGACC ATTATTATTA CTATIATACT 1080 

TACGAASATG ATGGTAAAGA AGACGATAGT GATCaMSATTA CGGCTCAAAT ACTATTATCC 1140 

AATTCASAAT TACGCACGAA GACGCCAAAT TTTGAGGATC CTTTTGAACA AATCAATATT 1200 

GAAGACAATA AASTAATATC TGTTAATACA CCAAAGACAA AGAAACCTAC TACAACAGTA 1260 

TTTGGCACTT CTACTAGTGC ATTATCAACT TTTGAAAGTA CAATATTTGA AAITCCCAAA 1320 

TTCTTTTATG GTAfiCAGAAG AAAACAACTG AGCTCATTCA AAAATAAGAA CAGTACAATC 1390 

AAATTTGATG TGTTTGATTG GATATTTGAA AGTCGTACTA CCAATGAGAA AGTACATGGA 1440 

TTAGTGTTGG TCTCTAGTGC TGTTCTACTA CGAACTTCTC TATTGTTCAT TTTGTAG 1497 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 485 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met His Phe Thr Ser Ser Leu Leu Ala Thr Leu Ile Trp Phe Thr Leu 
1 5 10 15 

Pro Val Gln Ser Leu Asn Thr Glu Ser Arg Thr Thr Ser Asn Asn Thr 
20 25 30 

Ile Ser Ile Leu Thr Asn His Phe Gln Ile Leu Lys Asp Leu Leu Pro 
35 40 45 

Tyr Ser Lys Thr Ser Lys Pro Gln Ile Lys Glu Ser Arg Pro Leu Ile 
50 55 60 

Lys Val Ser Arg Asp Gly Val Pro Ile Asn Phe His Arg Ala Pro Ala 
65 70 75 80 

Ile Ile Met Lys Ser Asn Lys Thr Asp Asp Leu Val Arg Asn Ser Asn 
85 90 95 

Lys Thr Met Val Leu Thr Glu Ile Lys Thr Ile Thr Glu Phe Ala Thr 
100 105 110 

Thr Thr Val Ser Pro Thr Gln Glu Phe Gln Ala Leu Gln Ile Asn Leu 
115 120 125 

Asn Thr Leu Ser Ile Glu Thr Ser Thr Pro Thr Phe Gln Ser His Asp 
130 135 140 

Phe Pro Pro Ile Thr Ile Glu Asp Thr Pro Lys Thr Leu Glu Pro Glu 
145 150 155 160 

Glu Ser Ser Asp Ala Leu Gln Arg Asp Ala Phe Asp Gln Ile Lys Lys 
165 170 175 

Leu Glu Lys Leu Val Leu Asp Leu Arg Leu Glu Met Lys Glu Gln Gln 
180 185 190 

Lys Ser Phe Asn Asp Gln Leu Val Asp Ile Tyr Thr Ala Arg Ser Ile 
195 200 205 

Val Pro Ile Tyr Thr Thr His Ile Val Thr Ser Ala Ile Pro Ser Tyr 
210 215 220 

Val Pro Lys Glu Glu Val Met Val Ser His Asp Thr Ala Pro Ile Val 
225 230 235 240 

Ser Arg Pro Arg Thr Asp Ile Pro Val Ser Gln Arg Ile Asp Thr Ile 
245 250 255 

Ser Lys His Lys Met Asn Gly Lys Asn Ile Leu Asn Asn Asn Pro Pro 
260 265 270 

Pro Asn Ser Val Leu Ile Val Pro Gln Phe Gln Phe His Glu Arg Met 
275 280 285 

Ala Thr Lys Thr Glu Val Ala Tyr Met Lys Pro Lys Ile Val Trp Thr 
290 295 300 

Asn Phe Pro Thr Thr Thr Ala Thr Ser Met Phe Asp Asn Phe Ile Leu 
305 310 315 320 

Lys Asn Leu Val Asp Glu Thr Asp Ser Glu Ile Asp Ser Gly Glu Thr 
325 330 335 

Glu Leu Ser Asp Asp Tyr Tyr Tyr Tyr Tyr Ser Tyr Glu Asp Asp Gly 
340 345 350 

Lys Glu Asp Asp Ser Asp Glu Ile Thr Ala Gln Ile Leu Leu Ser Asn 
355 360 365 

Ser Glu Leu Gly Thr Lys Thr Pro Asn Phe Glu Asp Pro Phe Glu Gln 
370 375 380 

Ile Asn Ile Glu Asp Asn Lys Val Ile Ser Val Asn Thr Pro Lys Thr 
385 390 395 400 

Lys Lys Pro Thr Thr Thr Val Phe Gly Thr Ser Thr Ser Ala Leu Ser 
405 410 415 

Thr Phe Glu Ser Thr Ile Phe Glu Ile Pro Lys Phe Phe Tyr Gly Ser 
420 425 430 

Arg Arg Lys Gln Ser Ser Ser Phe Lys Asn Lys Asn Ser Thr Ile Lys 
435 440 445 

Phe Asp Val Phe Asp Trp Ile Phe Glu Ser Gly Thr Thr Asn Glu Lys 
450 455 460 

Val His Gly Leu Val Leu Val Ser Ser Gly Val Leu Leu Gly Thr Cys 
465 470 475 480 

Leu Leu Phe Ile Leu 
485 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1651 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GAlSCTCTTCC AGAGGCAACA AGCGGAAGM^ GCACAACGAA AGAAGGAATT TGAACAAAAG 60 
GCCGMTTCA TCAAAGCATC ATTACTTGAA ATGCGCCGAA GAGAAATAGA GAGGCGGAAA 120 
CAGCAAAAGG AAAGGGAACA AAGACAAAAG GAGCACGAAG CAAAGAGGGA TATCACGATA 180 
CAACAACTTT CAGAGCAGGA TTCACGGAGT AATCAAACTA AAGAAGAAGA GGAAGTGTTC 240 
AAGAAGGCCC 6GTCTACTAA TTCGGGAGCA GACGAGACTG GTTTGATGTC AGAIAAAGAG 300 
TTTGATGATT CTGCATATTC ACCCGATTAT TTGTTTGAAG AGAATTTGTG GAATAAACCA 360 
AATCATCCAG ATACAAATCA TAAAACCAAA AAATATACTG AGAATGTGGT TGAAAATCTA 420 
GATTCTCCAC CAAATGATAC ATCTGCGTAC AATTCAAGTT TTCATGATGA AACTAATATT 480 
CAAAATGAGA TCCAAATACC AGAAAATGAC GAGTATGTAC CACAGATGAA AGCTACATCC 540 
AGTGTCAATA ATACCACCAT CCCTCCACAA AGAAGACATG AGTCACTTTC CACTTCTGAA 600 
AACAAAAGAA GGAAATTTGA AACAGCCGAC GTTGCGGTTG ATCGGTTAGA TTCCCCAGTG 660 
CGGGCACAAC CAGAAATATC TGGAAAATCC AAGTCTCCGA TAATCCCTGA TGTAATACTT 720 
TTACTGGACG AAGAGACTGA AACTCCTGAA GCAAATCCTG TGCAGGACAA TAGTACATAT 780 
ATTCCTCAGG GGTCTTTAGG ACACGAATTT AGAAATATTT TGGAAGAGCA TCCACGTCAA 840 
GTAAAGAATA AACAAAATTC TGGTGTTGCT TTTGCATTTC CGAATGCTTC CAAGAATACC 900 
GAAAACAAAC TCCACTCTAA TTTCAAAGAT AAAGATGAAG GAATAATTGA TGTTGAAGCT 960 
TACGTACCTG ATGTCAAAGC AGCAACTTCA AACACCACCC CAGCAACAGG ACAAACATCA 1020 
GCAAGGTCGG AAAAACTGCC ACCCTTACCT ACTCATATTC CAAATCCATC GACCATGAAT 1080 
GAAGCTCGAC CTCATCCAAC AACTCCACAT AAAAGATCAA AAGTCATTTT CGATTTAAAA 1140 
GATTTAGAAC AAAAGTTAGG TAATGATATT GAGGATTTGG ATTTTAAGGA TATGTATGAG 1200 
AGTTTGCCTG ACCATTCAAG TAAGGCAACA CCTAAAGACG ATATTTTAAC CCGTTCTAAA 1260 
AGAAGACTTT ATACATATAC CGATGGAACA TCAAAGGCTG AAACGTTATC TACACCAATG 1320 
AACAAAAATC CTGTTCGTGG ACATAGTACC AAGAAAAAGC TTAGTATCTT GGACATGCAT 1380 
GCCTCTTCTA AAATTCAAAG TCTTTTACCT CCACAACCGC CACAAATGTC AATTGATCCT 1440 
TCTGTTTCCA AGCAAGTGTG GGCTAAATAC GTTGATGCAA TCTTGACTTA TCAAAGAGAA 1500 
TTTTTCAATT ATAAAAAAGT GATTGTTCAA TACCAAATGG AACGGATAAA CAAAGACCTT 1560 
GAACATTTTG ACGATATAAA TGATGGTTCA CACACTGAGA ATTTGGATAC TTTCAAGCAT 1620 
TGTTTAGAAC AAGATTATTT GGTTAGTTGA C 1651 




(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 463 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

AACCTGTTGA CGCGTTGTCT TTTTCTACCC OUTCTTTAAC AATCTTGCCA CTCAATTCAC 60 

TAISCCAAATA AACTTTAGAC TCACAACTCT AACACTGACT CGCCCCCCCC TGTTTAAACT 120 

CTAAATTACT TCACAGAGCC TTTACTACCT TAATTTAAGA TTATCTATTG TTTCTGTTCT 180 

TTTGCAATCA CCCTGACTCG TTTTTTTTTC ACCCAGTTTT TTCGTAAAAT CTGACCAAAA 240 

ATTTACAACT CTAATTTAAA ACTCTAAATA ACAATTAAAA CTCAATTCAG ACAAGTCCTT 300 

CTGCTCATTC TGACTCTTCT CTATTGTCTT TTGACTTPTT GTGTGTGACT ATTTTCATGA 360 

TCACCCCGTT TCTTGCATTT TTTTCAGTCA ACTTTTTCTC AAAATCAAGC CAAAAAAACA 420 

CATTTAACTG CCTATACAAC GCAAACCTAT TCAAAACAAG GTT 463 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 582 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

AACCTCCCCG TTAACCACTT CTAGGTATAC CATTTCATCT GACTGAATAA CTGGTTACTC 60 
GATTTGTTGT TGAAGAAAAG TGACCACCTA GWTTTTCTG CCAACATTTT TTGCGATGAG 120 
CCGTCGACGC GTTGTCTTTT TCTACCCCAC GTTTAACAAT CTTGCCAGTC AATTCCCTAG 180 
CCAAATAAAC TTTAGACTCA CAACTCTAAC ACTGACTCGT GCCCCCCTGT TTAAACTCTA 240 
AATTACTTCA CAGAGCCTTT ACTACCTTAA TTTAAGATTA TCTATTGTTT CTGTTTTTTT 300 
GCAATCACCC TGACTCGTTT TTTTTTCAGC CAGTTTTTTC GTAAAATCTG ACCAAAAATT 360 
TACAACTCTA ATTTAAAACT CTAAATAACA ATTAAAACTC AATTCAGACA AGTCCTTCTG 420 
CTCATTCTGA GTCTTCTCTA TTGTCTTTTG ACTTTTTGTG TGTGACTATT TTCATGATCA 480 
CCCCGTTTCT TGCATTTTTT TCAGTCAACT TTTTCTCAAA ATCAAGCCAA AAAAACACAC 540 
CTTTAACTAC CTATACAACG CAAACCTATT CAAAACAACG TT 582 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1066 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 183 
(D) OTHER INFORMATION: /note= "W = A or T" 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 
(B) LOCATION: 564 
(D) OTHER INFORMATION: /note= "Y = C or T" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

AACCATAAAT ATGCCAAGAT TTAAACAAGT TGATGTATTC ACCAATGTCA AATATTTGGG 60 

TAATCCAGTT GCCGTTATTT ATGATAGTGA TAATTTAACC ACTCAAGAAA TGCAAAAAAT 120 

TGCTCGATGG ACAAATTTAT CAGAAACAAC ATTTATATTG ACTCCAAAAT CATCAATTGC 180 

TGWTTATAGT ATTAGAATTT TCACTTCTGG TGGGAATGAA TTACCATTTG CTGGTCATCC 240 

TACTTTAGGT ACTGCATTTG CATTATTGGA AGATGGTAAA ATAAAACCAA ATGACAATGG 300 

ACAAATAATT CAAGAATGTG GTGCTGGATT AGTGAAAATA TCCGTTGAAA AAACACCTAA 360 

TAATAATAGT AATGAGTTGC CGTTTTTGTT ATCTTTTGAA TTACCATATT TCAAATTTCA 420 

TGAAATTGAT GACAAAGTAA TCGAGGAATT ACAACATTCA TGGAATGGAA CCAATATTAT 480 

TGGTAAACCG GTACTTATTG ATGCTGGTCC AAAATGGGCA GTTTTCCAAC TTGGCTCCGG 540 

TAAAGAAGTA TTAGACTTGA ATGYTGATTT AGCACAAATT GAGAGATTAA GTTTAGAAAA 600 

TGGTTCGACA GGAATTGGTG TCTTTGGAAA ACATAATGAA AATGGTGATT CGGTCGAATT 660 

GAGAAATATT CCTCCTGCTG TTGGAGTCGC TGAAGATCCT GCTTGTGGAA GTGGATCAGG 720 

TGCTATTGGA CCATATTTGG CAAATCACGT TTTCAATGAA AAGGAAAAAT TTACAATTGA 780 

TATTTCTCAA GGTAAACCAA TTGAAAGAGA TGCTAAGATT CAAGTTAAAC TTAATCGTCT 840 

TACCACCAAA AATGGTGATT TATCTATTCA TGTTGGTGGT CATGCCATCA CTTGTTTCGA 900 

AGGTACTTAT TCTATTTAAA ACTTGATATA ATTCTTGRGT TATATCTAAT TTATCTAATT 960 

CACTTGTCCC TGGAGTAGTT TGATCTAATT GATGTAATTT ATTTAATAAA TCACGTTCTA 1020 

AATCAGTTTG TTTAGATAAA TCATTTAATA AATCATCTTC AGCATT 1066 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 302 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Pro Arg Phe Lys Gln Val Asp Val Phe Thr Asn Val Lys Tyr Leu 
1 5 10 15 

Gly Asn Pro Val Ala Val Ile Tyr Asp Ser Asp Asn Leu Thr Thr Gln 
20 25 30 

Glu Met Gln Lys Ile Ala Arg Trp Thr Asn Leu Ser Glu Thr Thr Phe 
35 40 45 

Ile Leu Thr Pro Lys Ser Ser Ile Ala Xaa Tyr Ser Ile Arg Ile Phe 
50 55 60 

Thr Ser Gly Gly Asn Glu Leu Pro Phe Ala Gly His Pro Thr Leu Gly 
65 70 75 80 

Thr Ala Phe Ala Leu Leu Glu Asp Gly Lys Ile Lys Pro Asn Asp Asn 
85 90 95 

Gly Gln Ile Ile Gln Glu Cys Gly Ala Gly Leu Val Lys Ile Ser Val 
100 105 110 

Glu Lys Thr Pro Asn Asn Asn Ser Asn Glu Leu Pro Phe Leu Leu Ser 
115 120 125 

Phe Glu Leu Pro Tyr Phe Lys Phe His Glu Ile Asp Asp Lys Val Ile 
130 135 140 

Glu Glu Leu Gln His Ser Trp Asn Gly Thr Asn Ile Ile Gly Lys Pro 
145 150 155 160 

Val Leu Ile Asp Ala Gly Pro Lys Trp Ala Val Phe Gln Leu Gly Ser 
165 170 175 

Gly Lys Glu Val Leu Asp Leu Asn Xaa Asp Leu Ala Gln Ile Glu Arg 
180 185 190 

Leu Ser Leu Glu Asn Gly Trp Thr Gly Ile Gly Val Phe Gly Lys His 
195 200 205 

Asn Glu Asn Gly Asp Ser Val Glu Leu Arg Asn Ile Ala Pro Ala Val 
210 215 220 

Gly Val Ala Glu Asp Pro Ala Cys Gly Ser Gly Ser Gly Ala Ile Gly 
225 230 235 240 

Ala Tyr Leu Ala Asn His Val Phe Asn Glu Lys Glu Lys Phe Thr Ile 
245 250 255 

Asp Ile Ser Gln Gly Lys Pro Ile Glu Arg Asp Ala Lys Ile Gln Val 
260 265 270 

Lys Val Asn Arg Leu Thr Thr Lys Asn Gly Asp Leu Ser Ile His Val 
275 280 285 

Gly Gly His Ala Ile Thr Cys Phe Glu Gly Thr Tyr Ser Ile 
290 295 300 






(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2829 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGACGGAAA CTGTGATAGA AAAGAAAAGA AAGGTTGATT TAAATGCCTC AGGTATTACA 60 
AAACAACCAA AAGCTTCTAA AATCTTCAGT CCATTCAGAG TTTTAGGGAA TGTTACAGAC 120 
TCAACTCCTT TTGCCATGGG GACATTAGGT TCAACATTTT ATGCTGTCAC TTCTGTTGGC 180 
AGATCTTTCC AAATTTATGA CTTGCCTACA TTACATTTAT TGTTTGTTTC CCAAACTCAA 240 
ACTCCTTCAA GAATTACAAG TTTGGCTGCA CACCATCACT ATGTCTATGC ATCTTATGGT 300 
GATCGTATTG GTATTTTTAG ACGTGGTAGA TTAGAGCATG AATTGGTTTG TGAAGGGAAC 360 
TCTACAGTTA ACCAATTATT AGTATTTGGA GAATACCTTA TTCCTACCAC ATTAGAACGT 420 
GATATTTTCG TATTTAGAAA AACTGAACGA AAGAAATTCC CAACTGAATT ATACACTACA 480 
ATCAGAATAA TTAATTCTTT AGTTGAAGGA GAAATTGTGG GATTAATTCA TCCACCTACG 540 
TATTTAAATA AACTAATTGT TGCTACTACT CAATCTGTGT TTGTTATAAA TGTGAGAACT 600 
GGCAAATTAT TATACAAATC CCGGGAATTA CAATTCGAAG GCGAAAAGAT TTCATCAATC 660 
GAAGCTGCTC CAGTTTTGGA TGTAATTGCT GTTGGTACAT CTAATGGAAA TGTATTTTTA 720 
TTCAACATTA AAAAGGGGAA ACTGTTGGGC CAAAAAATTA TTACTTCTGG AACTGAATCT 780 
TCTTCGAAAG TTGCCTCGAT CTCTTTTAGA ACAGATGGAG CACCTCATTT GGTTGCTGGT 840 
TTGAATAACG GGGACTTATA TTTCTACGAT TTAGACAAGA AATCACGTGT TCATGTTTTG 900 
AGAAATCCCC ATAAAGAGAC TCATGGGCGT GTTGCAAACG CCAAATTTTT GAATGGTCAA 960 
CCAATAGTAT TATCAAATGG TGGTGATAAT CATTTGAAAG AATTTGTTTT TGATCCTAAT 1020 
TTAACCACTT CGAATTCATC CATTGTTCCT CCTCCAAGAC ATCTCAGATC TAGAGGTGGG 1080 
CATTCAGCAC CACCAGTAGC TATTGAATTT CCTCAAGAAG ATAAAACCCA TTTTTTATTG 1140 
AGTGCTTCTA GAGATAAAAC ATTTTGGACA TTCTCTTTGA GAAAAGATGC TCAAGCACAG 1200 
GAAATGTCTC AAAGATTGCA AAAATCTAAG GATGGTAAAA GACAGGCTGG ACAAGTTGTT 1260 
TCTATGAGAG AGAAATTCCC AGAAATCATT TCCATTTCAT CCTCTTATGC CAGAGAAGGT 1320 
GATTGGGAAA ATATCATAAC CGCCCACAAG GATGAAACTT TTGCGAGAAC ATGGGATTCA 1380 
AGAAATAAAA GACTCGGTAC ACATTTGTTA AACACTATTG ATGGTGCCAT TCTGAAATCT 1440 
GTATGTGTGT CTCAGTGTGG TAATTTTGGT TTAGTGGGAT CATCACTGGC TGGTATTGGA 1500 
TCATAGAACC TTCAAAGTGG ATTGTTCCGT AAAAAATATG TTTTACATAA ACAAGCTGTC 1560 
ACCGGTTTAG CAATTGATGG AATGAATAGA AAAATGGTTA GTTGTGGTTT AGATGGAATT 1620 
GTGGGATTCT ATGATTTTGG AAAGTCTGTC TATTTAGGCA AATTACAACT TGAAGCACCT 1680 
ATAACATCCA TGATATATCA CAAACTGTCT GATCTTGTTG CTTGTGCCTT GGATGATTTG 1740 
TCCATAGTTG TTATTGACGT GACTACTCAA AAAGTCATAA GAATATTATA TCGTCATACC 1800 
AACAGAATTT CAGGAATGGA TTTCTCGCCT GATGGGAGAT GGATAGTTTC ACTTGCATTG 1860 
GACTCCACTT TGCGAACTTG GGACTTCCCA ACTGGTGGTT GTATTGATGC GGTGATTTTA 1920 
CCAATTCTGG CAACTCCAGT TAAATTTTCT CCTATTGCTC ATATCTTAGC GACAACACAT 1980 
GTCTCTGGAA ATGGTGTATC CTTATCGACT AATCGTGCCC AGTTCAAGCC TGTCTCCACC 2040 
AGACACGTAG AAGAAGATGA GTTTTCAACT ATTTTATTAC CAAATGCTTC TGGAGATGGC 2100 
GGTTCAACAA TGCTAGACGG GTTTTTCGAC GAGGATTCTA ATGAAGACGG CACTATTGAT 2160 
GAACAGTATA CATCTGCTGC TCAAATTGAT GCATCCTTGA TTACTTTATC ATCAGAGCCA 2220 
JWJATCAAAAT TCAACACTTT ATTGCATTTG GATACCATTA AACAACAAAG CAAACCGAAA 2280 
GAAGCACCTA AAAAACCAGA AAATGCACCT TTCTTTTTAC AATTGACTGG ACAAGCAGTT 2340 
GCTGATAGGG CATCGGTTGC TGAAGGCAAA ACTTCAGAAC AAACAAATAA CACTGTTGAA 2400 
GAAACCAACA GCAAATTGCG TAAATTGGAT ACAAACGGTA ACCACGCATT TGAAACTGAA 2460 
TTCACAAAAC TATTAAGGGA AGCTGGAGAG AGTGO^CAAT TTGAAACATT TTTGACTTAC 2520 
TTACTTAACT TATCTCCTCC TGTATTGGAC TTGGAAATTA GATCACTTAA TTCATTTGTT 2580 
CCATTGWrtC AAATGACAAA TTTTATTCAA GCTTTAAATG CTGCTTTGAA ATCAAACGCA 2640 
AATTATGJUm TATGGGAAAC TTTATATCCC ATGTTTTTCA ACATACATGG TGATGTTATC 2700 
CATCAGTTTG AAAATGAMC TAGTCTTCAT GJVAGCTTTGG AAGAATACAG ACAGTTAAAT 2760 
GATGAAAAGA ATAACAAAAT GGATTCTTTA GTGAAATATT GTGCTAGTAT CGTAAGTTTT 2820 
ATTAGTTAG 2829 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 942 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Thr Glu Thr Val Ile Glu Lys Lys Arg Lys Val Asp Leu Asn Ala 
1 5 10 15 

Ser Gly Ile Thr Lys Gln Pro Lys Ala Ser Lys Ile Phe Ser Pro Phe 
20 25 30 

Arg Val Leu Gly Asn Val Thr Asp Ser Thr Pro Phe Ala Met Gly Thr 
35 40 45 

Leu Gly Ser Thr Phe Tyr Ala Val Thr Ser Val Gly Arg Ser Phe Gln 
50 55 60 

Ile Tyr Asp Leu Ala Thr Leu His Leu Leu Phe Val Ser Gln Thr Gln 
65 70 75 80 

Thr Pro Ser Arg Ile Thr Ser Leu Ala Ala His His His Tyr Val Tyr 
85 90 95 

Ala Ser Tyr Gly Asp Arg Ile Gly Ile Phe Arg Arg Gly Arg Leu Glu 
100 105 110 

His Glu Leu Val Cys Glu Gly Asn Ser Thr Val Asn Gln Leu Leu Val 
115 120 125 

Phe Gly Glu Tyr Leu Ile Ala Thr Thr Leu Glu Gly Asp Ile Phe Val 
130 135 140 

Phe Arg Lys Thr Glu Gly Lys Lys Phe Pro Thr Glu Leu Tyr Thr Thr 
145 150 155 160 

Ile Arg Ile Ile Asn Ser Leu Val Glu Gly Glu Ile Val Gly Leu Ile 
165 170 175 



EP 0 982 401 A2 



His Pro Pro Thr Tyr Leu Asn Lys Val Ile Val Ala Thr Thr Gln Ser 
180 185 190 

Val Phe Val Ile Asn Val Arg Thr Gly Lys Leu Leu Tyr Lys Ser Arg 
195 200 205 

Glu Leu Gln Phe Glu Gly Glu Lys Ile Ser Ser Ile Glu Ala Ala Pro 
210 215 220 

Val Leu Asp Val Ile Ala Val Gly Thr Ser Asn Gly Asn Val Phe Leu 
225 230 235 240 

Phe Asn Ile Lys Lys Gly Lys Val Leu Gly Gln Lys Ile Ile Thr Ser 
245 250 255 

Gly Thr Glu Ser Ser Ser Lys Val Ala Ser Ile Ser Phe Arg Thr Asp 
260 265 270 

Gly Ala Pro His Leu Val Ala Gly Leu Asn Asn Gly Asp Leu Tyr Phe 
275 280 285 

Tyr Asp Leu Asp Lys Lys Ser Arg Val His Val Leu Arg Asn Ala His 
290 295 300 

Lys Glu Thr His Gly Gly Val Ala Asn Ala Lys Phe Leu Asn Gly Gln 
305 310 315 320 

Pro Ile Val Leu Ser Asn Gly Gly Asp Asn His Leu Lys Glu Phe Val 
325 330 335 

Phe Asp Pro Asn Leu Thr Thr Ser Asn Ser Ser Ile Val Pro Pro Pro 
340 345 350 

Arg His Leu Arg Ser Arg Gly Gly His Ser Ala Pro Pro Val Ala Ile 
355 360 365 

Glu Phe Pro Gln Glu Asp Lys Thr His Phe Leu Leu Ser Ala Ser Arg 
370 375 380 

Asp Lys Thr Phe Trp Thr Phe Ser Leu Arg Lys Asp Ala Gln Ala Gln 
385 390 395 400 

Glu Met Ser Gln Arg Leu Gln Lys Ser Lys Asp Gly Lys Arg Gln Ala 
405 410 415 

Gly Gln Val Val Ser Met Arg Glu Lys Phe Pro Glu Ile Ile Ser Ile 
420 425 430 

Ser Ser Ser Tyr Ala Arg Glu Gly Asp Trp Glu Asn Ile Ile Thr Ala 
435 440 445 

His Lys Asp Glu Thr Phe Ala Arg Thr Trp Asp Ser Arg Asn Lys Arg 
450 455 460 

Val Gly Arg His Leu Leu Asn Thr Ile Asp Gly Gly Ile Val Lys Ser 
465 470 475 480 

Val Cys Val Ser Gln Cys Gly Asn Phe Gly Leu Val Gly Ser Ser Ser 
485 490 495 

Gly Gly Ile Gly Ser Tyr Asn Leu Gln Ser Gly Leu Leu Arg Lys Lys 
500 505 510 

Tyr Val Leu His Lys Gln Ala Val Thr Gly Leu Ala Ile Asp Gly Met 
515 520 525 

Asn Arg Lys Met Val Ser Cys Gly Leu Asp Gly Ile Val Gly Phe Tyr 
530 535 540 

Asp Phe Gly Lys Ser Val Tyr Leu Gly Lys Leu Gln Leu Glu Ala Pro 
545 550 555 560 

Ile Thr Ser Met Ile Tyr His Lys Ser Ser Asp Leu Val Ala Cys Ala 
565 570 575 

Leu Asp Asp Leu Ser Ile Val Val Ile Asp Val Thr Thr Gln Lys Val 
580 585 590 

Ile Arg Ile Leu Tyr Gly His Thr Asn Arg Ile Ser Gly Met Asp Phe 
595 600 605 

Ser Pro Asp Gly Arg Trp Ile Val Ser Val Ala Leu Asp Ser Thr Leu 
610 615 620 

Arg Thr Trp Asp Leu Pro Thr Gly Gly Cys Ile Asp Gly Val Ile Leu 
625 630 635 640 

Pro Ile Val Ala Thr Ala Val Lys Phe Ser Pro Ile Gly Asp Ile Leu 
645 650 655 

Ala Thr Thr His Val Ser Gly Asn Gly Val Ser Leu Trp Thr Asn Arg 
660 665 670 

Ala Gln Phe Lys Pro Val Ser Thr Arg His Val Glu Glu Asp Glu Phe 
675 680 685 

Ser Thr Ile Leu Leu Pro Asn Ala Ser Gly Asp Gly Gly Ser Thr Met 
690 695 700 

Leu Asp Gly Phe Leu Asp Glu Asp Ser Asn Glu Asp Gly Thr Ile Asp 
705 710 715 720 

Glu Gln Tyr Thr Ser Ala Ala Gln Ile Asp Ala Ser Leu Ile Thr Leu 
725 730 735 

Ser Ser Glu Pro Arg Ser Lys Phe Asn Thr Leu Leu His Leu Asp Thr 
740 745 750 

Ile Lys Gln Gln Ser Lys Pro Lys Glu Ala Pro Lys Lys Pro Glu Asn 
755 760 765 

Ala Pro Phe Phe Leu Gln Leu Thr Gly Gln Ala Val Gly Asp Arg Ala 
770 775 780 

Ser Val Ala Glu Gly Lys Thr Ser Glu Gln Thr Asn Asn Thr Val Glu 
785 790 795 800 

Glu Thr Asn Ser Lys Leu Arg Lys Leu Asp Thr Asn Gly Asn His Ala 
805 810 815 

Phe Glu Ser Glu Phe Thr Lys Leu Leu Arg Glu Ala Gly Glu Ser Gly 
820 825 830 

Gln Phe Glu Arg Phe Leu Thr Tyr Leu Leu Asn Leu Ser Pro Ala Val 
835 840 845 

Leu Asp Leu Glu Ile Arg Ser Leu Asn Ser Phe Val Pro Leu Thr Glu 
850 855 860 

Met Thr Asn Phe Ile Gln Ala Leu Asn Ala Gly Leu Lys Ser Asn Ala 
865 870 875 880 

Asn Tyr Glu Ile Trp Glu Thr Leu Tyr Ala Met Phe Phe Asn Ile His 
885 890 895 

Gly Asp Val Ile His Gln Phe Glu Asn Glu Thr Ser Leu His Glu Ala 
900 905 910 

Leu Glu Glu Tyr Arg Gln Leu Asn Asp Glu Lys Asn Asn Lys Met Asp 
915 920 925 

Ser Leu Val Lys Tyr Cys Ala Ser Ile Val Ser Phe Ile Ser 
930 935 940 






(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 725 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

AACCTGGCAA TTAACTCCCC GGCAAGTGAT AOCAGGMAT ACCIGTGTAT AGATTATAAT 60 
GGAACGCCGA TTTTTGCACT ATCACGCGTA ATAAGGACAG CAGTTGGACA TCGGTACATG 120 
AGAGAGCAAT GTAAGTCTTG ATAGTAATGA GCCGTGTTGA AGTAGTATTT TAATCTAATT 180 
TTACTCAAAA AAGGACAATG GAGATCTGGA GATAACACCA CACTAATCGG TTCTAGACAT 240 
AGACTAAGCC TGAAACGCGG TACTACAGCT TGTTTTGAAA AGGTTTGCGT TGTATAGGCA 300 
GTTAAATGTG TGTTTTTTTT GGGTAGAATT TGAGAAAAAG TTGACTGAAA AAAATGCAAG 360 
AAACGGGGTG ATCATGAAAA TAGACACACA CAAAAAGTCA AAAAACAATG GAAAAGCTTC 420 
AGAATAAGCA GTAGGACGTG TCTGAATTGA GTTTGTATTG TTATTTAGAG TTTTAAATTA 480 
GAGTTGTAAA TTTTTGGGTA GAATTTACGA AAAAGTCGAA CAAAAAAACG ACAAGTCAGG 540 
GTGATTGCAA AAAAACAGAA ACAATAGATA ATCTTAAATT AAGGTAGTAG AGGCTCTGTG 600 
AACTAATTTA GAGTTTAAAC ACGGGGGCAC GAGTCAGTGT TAGAGTTGTG AAGTTTATTT 660 
GGCTAGTGAA TTGACTGGCA AGATTGTTAA ACCTGGGGTA GAAAAAGACA ACGCATCGAC 720 
AGGTT 725 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1144 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CCATGATATA GAAATTGGTG GGTCAACGTA CTATCAAATT AACATAAAAC TACCACTTCG 60 
GTCATTCACG ATAAAGAAAC GCTACCTGCA ATTCCAGCAA TTGGTGCTGG ACTTGAGTCG 120 
TAATCTAGGC ATTGATAGTC GAGATTTTCC ATATGAATTA CCTGGGAAAC GGATCAACTG 180 
GCTTAACAAG ACCAGTATTG TTGAGGAGAG AAAAGTGGGA CTTGCAGAAT TTCTCAATAA 240 
CCTCATTCAA GACTCAACAC TTCAGAATGA ACGAGAAGTG TTGTCGTTTT TGCAATTGCC 300 
GTCTAATTTT AGATTCACCA AGGATATGTT ACAGAATAAT CGAGCAGACT TGGATTCTGT 360 
TJkTATrCTlJk GTTGAAACTG GATATACTCA ACGAATCGTC 420 
TAGCACCATT jk ^ ^ * #• ^ ri» TAraTA^TTrf: TGATCGCATT AGTCGGGTCT ACCAACCACG 480 
GATTCTCuAC rTATTCGTAc: AGATAAAGAA GAGGCCCTAA AGAAGAAGCA 540 
GTTGGTTTvC m/-&«-«P&*Pm#-& T^aTTTftTTa CTArVSQAAG TTCCCCGATC 600 
AAAGAGGGTG TTvvvTCGtfw rkrwAAAAm. &&CfiCCACAC ACATTACCAT TAAACAATAA 660 
AGAACTTCTT CAnCACCAAw xnnnikTxrm TPBlftarrBA fSAPAAARAAT TAfSArrAGrT 720 
TAGGGTGTTA ATTGCCCGGC Br&&&r>kA&*P TCfiCCArtf!TA ATTAATGCAG AACTAGASGA 780 
ACAlGAATGM ATGTT>»v>ATA «vzT«rT&&*PCi& &eASI3T(^dkC TACACGTCCA GCAAAATCAA 840 
GCAAGCAAGA CGCAGiAGCTA »/"ii*^»«rii«i*«i» a«pa<~TaaTTT KTTCCf'TJLCT TCGATATTAT 900 
CTGCCATTGA CGTTATTCTT GCAGGTTGGC CCAATTGTTC GTTTGAAACT TTTTCGAGGT 960 
CTTCAGCCTC TAATGCCCTA TCTGAGCTCT CGCCATCGAG TTTCCAAAAC CCGCCGATAT 1020 
TTTGAAAGAA TCTTTGAATG CCAAACCGTC GTGGCGGGAA CGATCTGCCT GCCTTGGCCA 1080 
AGTTGAATAT GCTAGGGTGG TACTGTAAAT AGAAGACAGA TCCAATAAAC GTTCCTATAA 1140 
ATGC 1144 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 290 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

His Asp Ile Glu Ile Gly Gly Ser Thr Tyr Tyr Gln Ile Asn Ile Lys 
1 5 10 15 

Leu Pro Leu Arg Ser Phe Thr Ile Lys Lys Arg Tyr Ser Glu Phe Gln 
20 25 30 

Gln Leu Val Ser Asp Leu Ser Arg Asn Leu Gly Ile Asp Ser Arg Asp 
35 40 45 

Phe Pro Tyr Glu Leu Pro Gly Lys Arg Ile Asn Trp Leu Asn Lys Thr 
50 55 60 

Ser Ile Val Glu Glu Arg Lys Val Gly Leu Ala Glu Phe Leu Asn Asn 
65 70 75 80 

Leu Ile Gln Asp Ser Thr Leu Gln Asn Glu Arg Glu Val Leu Ser Phe 
85 90 95 

Leu Gln Leu Pro Ser Asn Phe Arg Phe Thr Lys Asp Met Leu Gln Asn 
100 105 110 

Asn Arg Ala Asp Leu Asp Ser Val Gln Asn Asn Trp Tyr Asp Val Tyr 
115 120 125 

Arg Lys Leu Lys Ser Asp Ile Leu Asn Glu Ser Ser Ser Ser Ile Ser 
130 135 140 

Glu Gln Ile His Ile Arg Asp Arg Ile Ser Arg Val Tyr Gln Pro Arg 
145 150 155 160 

Ile Leu Asp Leu Val Arg Ala Ile Gly Thr Asp Lys Glu Glu Ala Leu 
165 170 175 

Lys Lys Lys Gln Leu Val Ser Gln Leu Gln Glu Ser Ile Asp Asn Leu 
180 185 190 

Leu Val Gln Glu Val Pro Arg Ser Lys Arg Val Leu Gly Gly Ala Val 
195 200 205 

Lys Glu Thr Pro Glu Thr Leu Pro Leu Asn Asn Lys Glu Leu Leu Gln 
210 215 220 

His Gln Val Gln Ile His Gln Asn Gln Asp Lys Glu Leu Asp Gln Leu 
225 230 235 240 

Arg Val Leu Ile Ala Arg Gln Lys Gln Ile Gly Glu Leu Ile Asn Ala 
245 250 255 

Glu Val Glu Glu Gln Asn Glu Met Leu Asp Arg Phe Asn Glu Glu Val 
260 265 270 

Asp Tyr Thr Ser Ser Lys Ile Lys Gln Ala Arg Arg Arg Ala Lys Lys 
275 280 285 

Ile Leu 
290 

(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2736 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 11 
(D) OTHER INFORMATION: /note= "N = G or A or T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 2723..2724 
(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 2714..2715 
(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 2710 
(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 2706..2707 
(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
ATGGAAAAAA NTTTGGCGAC TGTAAAGTTG TACACCGATT TGGACTGTGT TTTTAATTCA 60 
AACTATCCAA CAAGAATTGT TTGGGGTGCT TCTTACAATT TTGGAATTCA ACAGATGATG 120 
GCAAACTTTG ATCGGTTTTC AAAACCACCA GTGGATCCAT CTACAAAATT AGGATTTTGG 180 
GATAAGTTAA AGTATATCTT ACATGGTAAA TCCCAAATCA GAACTAGGAA AAGTTTAGAA 240 
GTTGCATTTA AAGGATCAAG AGATCCGTAT GATTTGTTCA CGACTGCAGG CGGGTTTGTA 300 
TTGTCATTTA GAAAGAATGT TGTCTGGGAC ATCAATAAAG ACGATAATTC GAAAAATTAC 360 
TTCGATATCA CGGCAGATAA AGTTTCCTGG TATATTCCAA ACTATTTAGC AGGACCATTA 420 
TTGGCTTGGA CAAGAAGTAG TAAAAATTCA ATTTATTTAC CAAATTCACC AAATGTGGTT 480 
AATTCTTGCT TTGCATATTA CCTTCAAGAT TTTACTGGAC AAGCTGATTT TGATCATGCT 540 
GCCCGAGTAT TTGAAAGAAA TGTGGTCAAT CTTAGTGGAG GAATTCATTT 600 
TTTCTACTTG AACGTAAAGA TACAAATGGT AAGAGAACCG ATGAATTCAA ACCTCATTAI- 660 
GAAGTGCAGT TGTTTGATCC CAAGTATTGT GAGAAAGGAC ATGACTCTTA TGCTwGvTTw 720 
CGAAGTCAAT TTATACATAT GGCTATCTCA TTGGAATCAA CAAACAGTTC AAGTTATAAT 780 
ACAATCCATC TTAGTCCTGC TACTTTCCAA CAGTTTTTCG ATTGGTGGAA CTTATTTGCT 840 
AGTAATATGC ACTTACCTAT TAGACGTGGC AAAATGTTTG GAGAAGCAAA AGAATCTGTC 900 
AAGTTTTCGC AACATTTATT CACAAACAAG TTTTCTTTCA TGTTGAAATC TTTGTTTATT 960 
GCTCATGTTT ATCGAGACGA AATTGTTGAT ATCAATAACG ATAGAATAGA AAGTATTGGT 1020 
TTAAGAGCCA AAGTAGATGA TTTTATGGTT GATTTACATC AAAGAAAAGA GCCAGCAACC 1080 
CTTTACCATG AAGAATTATC TAAGAATGAG AAGGTGATGA AAATGAATTT TGATTTAGGA 1140 
GAAGTCGTTT TATCAGGAAT AGACTTACGT GTCATGCATG TTTCATTTCT CCAAAATTTA 1200 
TACACTCAAT CACATTCCAA TTCAGGTGAC GCTAAATCAA CTTATAATAT TTACGACAAT 1260 
GATCATCGAT GGTTTGATAT TATGGATTTC CAAGAGGCAT TTTTGACATC AATTAAGGAT 1320 
TGTGTCAGGA CAGTTGATAT TTATCCATTC ATGTATTTAC AAAGATTCTT TTATGAAAGA 1380 
GATACACATG GTGGCAACTC TGACGATGAG ACTGCATTTG GAAAAGAAGT TATTCATAAA 1440 
TGTAATTTGG GTGCCATGAA TCCCTTGGAA ACAAGATTGA ATGTATTGGT TCAAAGACTT 1500 
AACGCTCTAC AAGAACAAGT CAAAAAATTG TCCAAAACAT CTGCTCCAGA ACCTCTiviWi 1560 
GATTTGAAAA AACGAATTCT GTTTTTGCAA AAAGAGATTA GCACAACCAA AjGCTuuC vT X 1620 
AAGTCGAAAA TGCGTCGTAC ATCCACTATA AATGGTATGA ATAATTCTGA 1680 
AATAAGTTTA CTTTCTATAA CATGCTTCTT AAATGGAATT TCAATTGTCG 1740 
TTGAAATACA TACATTTTGT GAAATTGAAA TCACAACTTC GAAATTACTT GT CAuAUAAl* 1800 
TCCATTGAAA CACTTGAAAA AATGATCGAT AGTGTAAATG CATACAACGA 1860 
TTGTCATCGA CGTCAGAAAT AATCCGTCGT TTCACACTGG AAGGGGTTAA ATCACAGACA 1920 
TCTACCAGCA AAGATATCAC TTCACAACAG AAACTTGACA ATTTCAACAC AATATTACGA 1980 
GAGACCAGAC CAGACGAAAA AGTGGTTGAG GATTATTTGA TTGACGTGAT CGCACCTCAA 2040 
ATTCAATTAC AAAGTGAGGA TTATCCTGAT TCTGTTGTGC TCATCTCTAC ACCATCTATT 2100 
AAACGTAAAA TTTTGTCCAT TAGGGATTCC AGGAATAATG CAAACCAAAT CTTGTTAGAA 2160 
ACTAGGTATG GTATTTTACT AAAAGATGCC AATCTTTTTC TATTAAACAA AGAGGATATT 2220 
GTAGGGTGTC CAGATATGTT AAGTATTAGT AATCCATATG GAGCTAAATC TAATTCGCCA 2280 
CCATGGCTAG GAACAGAAAT AACCCAAAAT GGTAAATGGG CTGGAGCCAA CAACTTATTC 2340 
ATTGAAAAGC TTTCTGTTAT GACAATGTGT TATGAAAGTG AAATTTTGTC AAGCAAGCTT 2400 
TCTCCAAATG CACAAGATCT GGATCAAGAA GACCAAGAAA ATTACAATGA TGATAATTCG 2460 
AAACAGGCTC CTCTTCGACT TGGTATTGAT ATCCCTTCTC TGGTGATTAC ATCTACATCA 2520 
AGTCAATACT TTACCTTATA TGTTATCATA GTGAGCTTCT TGTTTTATAG CGAGCCTATG 2580 
AGTAAAGTGA TCCACAAGAA AATCGAAAAG ATGAAGTTTT CTATTGATTT CGAAGATTTG 2640 
GGTGCTCTTA CTAGCAGATT AACGAAAATG CAGCAACATC ATAAATTGTT GAAAGTATTG 2700 
TCTAANNACN AATNNTTTCC CGNNCGGGGG AATTAA 2736 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 911 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Glu Lys Xaa Leu Ala Ser Val Lys Leu Tyr Thr Asp Leu Glu Cys 
1 5 10 15 

Val Phe Asn Ser Asn Tyr Pro Thr Arg Ile Val Trp Gly Ala Ser Tyr 
20 25 30 

Asn Phe Gly Ile Gln Gln Met Met Ala Asn Phe Asp Arg Phe Ser Lys 
35 40 45 

Pro Pro Val Asp Pro Ser Thr Lys Leu Gly Phe Trp Asp Lys Leu Lys 
50 55 60 

Tyr Ile Leu His Gly Lys Cys Gln Ile Arg Thr Arg Lys Ser Leu Glu 
65 70 75 80 

Val Ala Phe Lys Gly Ser Arg Asp Pro Tyr Asp Leu Phe Thr Thr Ala 
85 90 95 

Gly Gly Phe Val Leu Ser Phe Arg Lys Asn Val Val Trp Asp Ile Asn 
100 105 110 

Lys Asp Asp Asn Ser Lys Asn Tyr Phe Asp Ile Thr Ala Asp Lys Val 
115 120 125 

Ser Trp Tyr Ile Pro Asn Tyr Leu Ala Gly Pro Leu Leu Ala Trp Thr 
130 135 140 

Arg Ser Ser Lys Asn Ser Ile Tyr Leu Pro Asn Ser Pro Asn Val Val 
145 150 155 160 

Asn Ser Cys Phe Ala Tyr Tyr Leu Gln Asp Phe Thr Gly Gln Ala Asp 
165 170 175 

Phe Asp His Ala Ala Arg Val Phe Glu Arg Asn Val Val Asn Leu Ser 
180 185 190 

Gly Gly Ile His Phe Gln Val Gly Phe Leu Leu Glu Arg Lys Asp Thr 
195 200 205 

Asn Gly Lys Arg Thr Asp Glu Phe Lys Pro His Tyr Glu Val Gln Leu 
210 215 220 

Phe Asp Pro Lys Tyr Cys Glu Lys Gly His Asp Ser Tyr Ala Gly Phe 
225 230 235 240 

Arg Ser Gln Phe Ile His Met Ala Ile Ser Leu Glu Ser Thr Asn Ser 
245 250 255 

Ser Ser Tyr Asn Thr Ile His Leu Ser Pro Gly Thr Phe Gln Gln Phe 
260 265 270 

Phe Asp Trp Trp Lys Leu Phe Ala Ser Asn Met Gln Leu Pro Ile Arg 
275 280 285 

Arg Gly Lys Met Phe Gly Glu Ala Lys Glu Ser Val Lys Phe Ser Gln 
290 295 300 

His Leu Phe Thr Asn Lys Phe Ser Phe Met Leu Lys Ser Leu Phe Ile 
305 310 315 320 

Ala His Val Tyr Arg Asp Glu Ile Val Asp Ile Asn Asn Asp Arg Ile 
325 330 335 

Glu Ser Ile Gly Leu Arg Ala Lys Val Asp Asp Phe Met Val Asp Leu 
340 345 350 

His Gln Arg Lys Glu Pro Ala Thr Leu Tyr His Glu Glu Leu Ser Lys 
355 360 365 

Asn Glu Lys Val Met Lys Met Asn Phe Asp Leu Gly Glu Val Val Leu 
370 375 380 

Ser Gly Ile Asp Leu Arg Val Met His Val Ser Phe Leu Gln Asn Leu 
385 390 395 400 

Tyr Thr Gln Ser His Ser Asn Ser Gly Asp Ala Lys Ser Thr Tyr Asn 
405 410 415 

Ile Tyr Asp Asn Asp His Arg Trp Phe Asp Ile Met Asp Phe Gln Glu 
420 425 430 

Ala Phe Leu Thr Ser Ile Lys Asp Cys Val Arg Thr Val Asp Ile Tyr 
435 440 445 

Pro Leu Met Tyr Leu Gln Arg Phe Phe Tyr Glu Arg Asp Thr His Gly 
450 455 460 

Gly Lys Ser Glu Asp Glu Thr Ala Phe Gly Lys Glu Val Ile His Lys 
465 470 475 480 

Cys Asn Leu Gly Ala Met Asn Pro Leu Glu Thr Arg Leu Asn Val Leu 
485 490 495 

Val Gln Arg Leu Asn Ala Leu Gln Glu Gln Val Lys Lys Leu Ser Lys 
500 505 510 

Thr Ser Ala Pro Glu Pro Val Ala Asp Leu Lys Lys Arg Ile Ser Phe 
515 520 525 

Leu Gln Lys Glu Ile Ser Thr Thr Lys Ala Gly Val Lys Ser Lys Met 
530 535 540 

Arg Arg Thr Ser Thr Ile Asn Gly Met Asn Asn Ser Glu Asn Tyr His 
545 550 555 560 

Asn Lys Phe Thr Phe Tyr Asn Met Leu Leu Lys Trp Asn Phe Asn Cys 
565 570 575 

Arg Asn Leu Thr Leu Lys Tyr Ile His Phe Val Lys Leu Lys Ser Gln 
580 585 590 

Leu Arg Asn Tyr Leu Ser His Lys Ser Ile Glu Thr Leu Glu Lys Met 
595 600 605 

Met Asp Ser Val Asn Ala Tyr Asn Asp Lys Asp Asp Leu Ser Ser Thr 
610 615 620 

Ser Glu Ile Ile Arg Arg Phe Thr Ser Glu Gly Val Lys Ser Gln Thr 
625 630 635 640 

Ser Thr Ser Lys Asp Ile Thr Ser Gln Gln Lys Leu Asp Asn Phe Asn 
645 650 655 

Thr Ile Leu Arg Glu Thr Arg Pro Asp Glu Lys Val Val Glu Asp Tyr 
660 665 670 

Leu Ile Asp Val Ile Ala Pro Gln Ile Gln Leu Gln Ser Glu Asp Tyr 
675 680 685 

Pro Asp Ser Val Val Leu Ile Ser Thr Pro Ser Ile Lys Gly Lys Ile 
690 695 700 

Leu Ser Ile Arg Asp Ser Arg Asn Asn Ala Asn Gln Ile Leu Leu Glu 
705 710 715 720 

Thr Arg Tyr Gly Ile Leu Leu Lys Asp Ala Asn Val Phe Val Leu Asn 
725 730 735 

Lys Glu Asp Ile Val Gly Cys Pro Asp Met Leu Ser Ile Ser Asn Pro 
740 745 750 

Tyr Gly Ala Lys Ser Asn Trp Pro Pro Trp Leu Gly Thr Glu Ile Thr 
755 760 765 

Gln Asn Gly Lys Trp Ala Gly Ala Asn Asn Leu Leu Ile Glu Lys Leu 
770 775 780 

Ser Val Met Thr Met Cys Tyr Glu Ser Glu Ile Leu Ser Ser Lys Leu 
785 790 795 800 

Ser Pro Asn Ala Gln Asp Ser Asp Gln Glu Glu Gln Glu Asn Tyr Asn 
805 810 815 

Asp Asp Asn Ser Lys Gln Ala Pro Leu Arg Leu Gly Ile Asp Met Pro 
820 825 830 

Ser Val Val Ile Thr Ser Thr Ser Ser Gln Tyr Phe Thr Leu Tyr Val 
835 840 845 

Ile Ile Val Ser Leu Leu Phe Tyr Ser Glu Pro Met Ser Lys Val Ile 
850 855 860 

His Lys Lys Ile Glu Lys Met Lys Phe Ser Ile Asp Phe Glu Asp Leu 
865 870 875 880 

Gly Ala Leu Thr Ser Arg Leu Thr Lys Met Gln Gln His His Lys Leu 
885 890 895 

Leu Lys Val Leu Ser Xaa Xaa Xaa Xaa Phe Pro Xaa Arg Gly Asn 
900 905 910 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

AITCTTTCTT TGTTTCTTCA TTTTTGATCT CTTCTCTAfiA ATCACTCATT AATATTTGAT 60 

TCAGGGTTTT GATTTGCTAA ATAAGGGGTC TATTACGAGG ATATTATATA TAATGTGATG 120 

TGGCGAAAAA AAAAAACAAG ATCTACTACT CTGTTGGATT TATTTGTGAT GGCGATTGAA 180 

GAGAAAACAC GTCTTTTTAA CGCGTTTTTT TATTTTTTGG AGAAGCAAAT TTCAAGCAAA 240 

GACTCTTATT GTGTTGCTTT TGATCCATTC AAATTTTGTA TTACTTTTCA TPAGAACTAT 300 

AACTGTTCAT TATCAATGAC GTATACATGT CTOGTTCCTG TTATGTATTG TAATTTTAGT 360 

TAATTATAAG CCGTATATTG GTAGTATTCC TCTGTACTCA CAATGGAATT GGTCTTTCAA 420 

CAGCAACAAG TGTTATTTTC CCTGAATGTA GAAAATGAAA GGTAGTGTTT ACATATAGTT 480 

GGAAATCAAG CCTCTGARAT GAATCACAAT ATAATAACAA TTTGTAGTTG CAGAGAAAAA 540 

CAATTCAAGT TGACGGGTAG TTTTTTTTTT TTCACTGCAT TTTTCAACGA AAACTAAATA 600 

AAATTTCGCT CATATTGATA AAGTAT 626 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 



ATGGCGTCAA TTTCTGTTCC AATTGAARAA GGATCATTTC ACGATGGAGA TGGATTCAAT 60 
CAACATCATT TAGGAGACCC AGTTATTTCA GGACCTCCCT ATATTATTAA ATTATTAAAC 120 
TTACCCGTCA CAGCTAATGA TTCATTTGTC CAAGACTTGT TTCAAAGCAG ATTTACCCCA 180 
TATGTCAAAT TTAAAATTGT AACAGACCCC GCATCAAATA TTTTGGAGAC TCATGTCATT 240 
AGACAAGTGG CTTTTCTGGA ATTGGAATCG GCCAGTGATA TGTCAAAAGC TTTAAAATGG 300 
CATGATTTGT ATTATAAGAC AAATAGAAGA GTAACTGTTG AAGTGGCAGA TTTTAATGAT 360 
TTTCAAAATT GTATTAAATT CAATCAAGAA CATGAACGTG AAATTATCCA AATCCAACAA 420 
GAATTCATTG CTCAGAAACA ACAACAACGG CAACCCAGAC ATATGGCTCT TTTAGATGAA 480 
TTTGAAAGAA ACCAGXTGCGG TCCTGGATCA CCCTTGCATC AAAACCATGA TCACCACAAT 540 
CCCCACCCAC AACAACAACA ACACCATCAT TTCAATCCTA ATTTAAACAG ACCTTCAGGT 600 
AGATCAAGTC TTCCAATAGA TGAAACGTCT CATTCAAGAA GACTTTCTTT TG 652 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Ala Ser Ile Ser Val Pro Ile Glu Lys Gly Ser Phe His Asp Gly 
1 5 10 15 

Asp Gly Phe Asn Gln His His Leu Gly Asp Pro Val Ile Ser Gly Pro 
20 25 30 

Pro Tyr Ile Ile Lys Leu Leu Asn Leu Pro Val Thr Ala Asn Asp Ser 
35 40 45 

Phe Val Gln Asp Leu Phe Gln Ser Arg Phe Thr Pro Tyr Val Lys Phe 
50 55 60 

Lys Ile Val Thr Asp Pro Ala Ser Asn Ile Leu Glu Thr His Val Ile 
65 70 75 80 

Arg Gln Val Ala Phe Val Glu Leu Glu Ser Ala Ser Asp Met Ser Lys 
85 90 95 

Ala Leu Lys Trp His Asp Leu Tyr Tyr Lys Thr Asn Arg Arg Val Thr 
100 105 110 

Val Glu Val Ala Asp Phe Asn Asp Phe Gln Asn Cys Ile Lys Phe Asn 
115 120 125 

Gln Glu His Glu Arg Glu Ile Met Gln Ile Gln Gln Glu Phe Ile Ala 
130 135 140 

Gln Lys Gln Gln Gln Arg Gln Pro Arg His Met Ala Leu Leu Asp Glu 
145 150 155 160 

Phe Glu Arg Asn Gln Arg Gly Pro Gly Ser Pro Leu His Gln Asn His 
165 170 175 

Asp His His Asn Pro His Pro Gln Gln Gln Gln His His His Phe Asn 
180 185 190 

Pro Asn Leu Asn Arg Pro Ser Gly Arg Ser Ser Leu Pro Ile Asp Glu 
195 200 205 

Thr Ser His Ser Arg Arg Leu Ser Phe 
210 215 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1513 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1492 

(D) OTHER INFORMATION: /note= "N = A or G or C or T" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 



GTAGTTTGTG AAGAAATTGA AACAATCGGA AAACAACAAT ATCAAACTGA TGCCCAATAA 60 
CACTCTATGT ACCTAGATGG ATTACCAAGA TCTACTACAT AAAATAATAA AGGAGTTCCA 120 
CTCACTCAAA GAGTTCAAAC CATGGGATAG CAGTGTTTTG TATGAGACGT TACTACGATC 180 
AGTATTAACT ACTTTGATCG AACTTTTGGG CATAGACAAT CCACCCAGTT ATCTACACCT 240 
CACCACCAAC AATGATAGTA TAGGTGATTT GAAAATAAAA TACTATGGAA ATGCATTAAG 300 
CAAGTCAATC AACGGTCATA GCATGTTGCA ATATCTTGAA TCAAAGCATG TATCGATATT 360 
ACAGGCCGTG GTTGAGATTA TTAATACGCG ATCATATAGA ATCAAAGAGT CTTATTCTGC 420 
TGTTTTCAAA GACCTTTCTC ATTTATTTGA AAAACTACTA AAGGAAAGAT ATGAACCTGA 480 
ATCTAATCTA GAGGATTATA TATTGCAGTG CTTGATGTAC GAGACCCAAT TTTACCAAGG 540 
AATTGTTGAT AATGTTTTAA CTGCCGATGA CACCGAAAAA TTGGCTAGTT TTTTGGGGAC 600 
ACGACTATCT GAAGAAGATT CGATGTTTAG CTATAGGGAT ATAGATTATC CACTAGAGTT 660 
AAACATTAAT AATGAATCTC TTGAAAAGAT ATATAAAATT TTCTTAGGAG TCATTGGCAC 720 
CAAAAGATTC GATATCAAGG AGGTTGCGTC TGCTGTTCTT GGTGTGTATA AACGACACCA 780 
GAGAATAGAT CATTTTGAAA AGTTGGATTC AGATGAGATT TTGGGAAAGT TTTTCAGAAA 840 
TATATTGCCA CAACTGTTCC AGAGTGTGAC AAATAAGGTT TTCCGGGAAT TTCACAAAGA 900 
GGTAGATGAC CCACCATCGG ACGTGCTAGA CCACCTAGAT AATATTGTTC ATGACTTTAT 960 
TGCGGTTGGA ATTGAAGGGG TAGATTTGGG CTTTCCGGCT TTGTTCAGAC ACTACATAAA 1020 
ATTCATGAAC GAAATTTTTC CCACTGTGGT CGAGGATGCT GACCGCGATT TTGTTGCAAG 1080 
AATTAATAGT TTAATTGCTC AAGTCTTGGA GTTTAAAGAC GATGAAAAAT CCTGTGATAT 1140 
CAATCAAGTG GTATCTGAAT TTGTTTCATT ACAAAGTTTG CTACTTAAGA ATAACTATCT 1200 
TTCACCATCT ACATTATTGA TGCGTGCAAG TACTCACGAT TACTATAAAA ATTTACAGAT 1260 
CGTGAAAATA ACCTTTGATG GATGGAATGA GAATTCAAAG AGGATATTGA AATTGGAGAA 1320 
CAGCGGCTTT TTACAAAGCA AGACATTGCC AAAGTATTTA AAATTATGGT ACTCAAAAAG 1380 
TATGAAGTTG AATGAATTAT GTAACCGGGT AGATGAATTT TATAATCGAG AACTTTCTCC 1440 
GAAAGTTTTG GGCATTGTTG GGAGGGTCAC AACCAAAATG TCTATAAATC CNCAAAAATG 1500 
GGAGGGTTGC TGA 1513 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 478 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Met Asp Tyr Gln Asp Leu Leu His Lys Ile Ile Lys Glu Phe His Ser 
1 5 10 15 

Leu Lys Glu Phe Lys Pro Trp Asp Ser Ser Val Leu Tyr Glu Thr Leu 
20 25 30 

Leu Arg Ser Val Leu Thr Thr Leu Ile Glu Leu Leu Gly Ile Asp Asn 
35 40 45 

Pro Pro Ser Tyr Leu His Leu Thr Thr Asn Asn Asp Ser Ile Gly Asp 
50 55 60 

Leu Lys Ile Lys Tyr Tyr Gly Asn Ala Leu Ser Lys Ser Ile Asn Gly 
65 70 75 80 

His Ser Met Leu Gln Tyr Leu Glu Ser Lys His Val Ser Ile Leu Gln 
85 90 95 

Ala Val Val Glu Ile Ile Asn Thr Arg Ser Tyr Arg Ile Lys Glu Ser 
100 105 110 

Tyr Ser Ala Val Phe Lys Asp Val Ser His Leu Phe Glu Lys Leu Leu 
115 120 125 

Lys Glu Arg Tyr Glu Ala Glu Ser Asn Leu Glu Asp Tyr Ile Leu Gln 
130 135 140 

Cys Leu Met Tyr Glu Thr Gln Phe Tyr Gln Gly Ile Val Asp Asn Val 
145 150 155 160 

Leu Thr Ala Asp Asp Thr Glu Lys Leu Ala Ser Phe Leu Gly Thr Arg 
165 170 175 

Leu Ser Glu Glu Asp Ser Met Phe Ser Tyr Arg Asp Ile Asp Tyr Pro 
180 185 190 

Leu Glu Leu Asn Ile Asn Asn Glu Ser Leu Glu Lys Ile Tyr Lys Ile 
195 200 205 

Phe Leu Gly Val Ile Gly Thr Lys Arg Phe Asp Ile Lys Glu Val Ala 
210 215 220 

Ser Ala Val Val Gly Val Tyr Lys Arg His Gln Arg Ile Asp His Phe 
225 230 235 240 

Glu Lys Leu Asp Ser Asp Glu Ile Leu Gly Lys Phe Phe Arg Asn Ile 
245 250 255 

Leu Pro Gln Ser Phe Gln Ser Val Thr Asn Lys Val Phe Arg Glu Phe 
260 265 270 

His Lys Glu Val Asp Asp Pro Pro Ser Asp Val Leu Asp Gln Leu Asp 
275 280 285 

Asn Ile Val Asp Asp Phe Ile Ala Val Gly Ile Glu Gly Val Asp Leu 
290 295 300 

Gly Phe Pro Ala Leu Phe Arg His Tyr Ile Lys Phe Met Asn Glu Ile 
305 310 315 320 

Phe Pro Thr Val Val Glu Asp Ala Asp Arg Asp Phe Val Ala Arg Ile 
325 330 335 

Asn Ser Leu Ile Ala Gln Val Leu Glu Phe Lys Asp Asp Glu Lys Ser 
340 345 350 

Cys Asp Ile Asn Gln Val Val Ser Glu Phe Val Ser Leu Gln Ser Leu 
355 360 365 

Leu Leu Lys Asn Asn Tyr Leu Ser Pro Ser Thr Leu Leu Met Arg Ala 
370 375 380 

Ser Thr His Asp Tyr Tyr Lys Asn Leu Gln Ile Val Lys Ile Thr Phe 
385 390 395 400 

Asp Gly Trp Asn Glu Asn Ser Lys Arg Ile Leu Lys Leu Glu Asn Ser 
405 410 415 

Gly Phe Leu Gln Ser Lys Thr Leu Pro Lys Tyr Leu Lys Leu Trp Tyr 
420 425 430 

Ser Lys Ser Met Lys Leu Asn Glu Leu Cys Asn Arg Val Asp Glu Phe 
435 440 445 

Tyr Asn Gly Glu Leu Cys Arg Lys Val Leu Gly Ile Val Gly Arg Val 
450 455 460 

Thr Thr Lys Met Ser Ile Asn Xaa Gln Lys Trp Glu Gly Cys 
465 470 475 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 436 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

AGTTATGTCT CATACATACA ACACAGATGA GGACATGTGT TTAAATGATA AATTGAAATA 60 

TTTGTACGAT TTATAATCGC TTTATCGTGA CAATTTCGAA TACTGGTACT TTCTACTCTA 120 

TTTGACAAAA ATTTGCAAAA AAXTGGGGAA AAAAATCCTG TTGCATTTTC GAGACCATCA 180 

GTTGCAACCA ATCTGAATAT ATTTTGACAC TTCAATAAAT CTAGTGAAAC TACTCGTCTA 240 

CTTTTTAATT CTAATCATCT CATAGTATAT CAAGCAAAGA CTTACTATGC GTTTATCAAA 300 

TTTAAGAAAA TGTAGACAGT ACGAAAATAC ACGAGTTTCC CAATCTTTGA ACTTGAAAAG 360 

ATAGTAATAC CGAGATTGGC CAAATCCTAC CCATAGTCCG TTCATACAAA TTCATGAACA 420 
ACATCTACAT AAGTAA 
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 717 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
CTTTCTTTCG AATTAGATTC AATCTTTTCC AATTTTGCTT GTACACTTGC TAGTTTGAAT 60

TTACGTTTTT CCTCTTTACG TTGTTTCACA ATGGCTGCAC GTTCTTCAAA ATTTATTCCC 120

TTCTTCTTGG TTGCTCTTAT ATCGTTCTCA TCTTCAGGCT TCCTCTCCTC TTGTAACCCT 180

TCTTTTTCTA ATAGTTTGAA ATACTTCTTT CTTAATCTAG CCCTATGGGT TAATGCACGT 240

TTTATATCTT GAGACTTGGC TTCTCGACGA TCTATAAATT TCITTTTTGA TTTAAATGAA 300

TTITTATTAT TTGGATGCAT TGTTGTGGAC GTGTATTTGA TAG6TTGATA ACTAGAAATA 360

AAAACTATGT GAAAGAACAA AATGCCAATC ACTAAAAAAA ATTTAAGATG AGTATSAAAT 420

CAAAACTTTA CGACATCTTT GCGACATGCA CATTATGAGC GACATTTTGA TTCGATACCA 480

GAAATAGACA GATTTAGACA GGGTCTATAA CAGAGAAATC AACAATTAAC TGGTATCAAC 540

CTTAAGATTA AAAATGGTCT ATGGCGATAT GAACTGTTGT GATGAAAAAC AATATATTTG 600

GAAATACTTC TTTTCATTTC ACAATTTTTT ATAAAATTTT GGCAACAATT TTGTACCTAA 660

AAATTCTTTT GTCTTCAAAA GTGAAATGTA ATATAGAAAT ACTATTACAA CCAAACA 717



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 667 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:



TTTAGTTTTA TATTGATGAT GTTTTTAAGT GCTTGTTTAT CATGGTGGAT GGAAATTAGA 60

ATGAGTAAAT TGAATGGAAA ATCACTGCAA CACCAACAAC AACCACTGGT GGATACGAAA 120

ATTTAGTGTA CAAATTTCTG CCAAAAAAAT ACAATAAAAA CCGCTTATAG TCTTCTACTG 180

ACATAACAAC ACAAGTCAAT AAATCAACAA CTCATAAACA ATGTAGACTT AATACTATCG 240

CTTAATTATT TAAACTATAA TAAATACCCT ATAGTATTAT CCCTTTGTCA ATGTGTGTAG 300

AATTTGGTTA TTACATATCC ATGTGTAATA TATATGTTGA TCAAAAAACG CGATCTTCTC 360

TTTGGTGTAG TGTCTTACAC AAAAAATTCA CTAGTCTAGG TCACATGATA ATCACGTGAA 420

AATCAAAAAT TTGTTGAAAT TGAATTTCCT CAATTTTGAA ATTTTGTTTG AAATTTTTTT 480

TTTGCTTTAC AAAAAGACTC CATTTTGTTT TCCATTTCAC AACCAATTAC TTAATTCCTC 540

TTTTTCATAA TTAATAACTA TCATTACTTA CAACTACAAA CAACTACGAT CATTTCCTAA 600

GAAAAAGCAA CGAGGGCGAA TTGAGACATT AATCCCCTTT ATTTTATCAT CATGCCTTAT 660

ACAGAAC 667



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 165 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
AACTATTQCC AATGCTAAAT ATGCCACTGA AATCGWSAAT TTTAATAACT CGGTCCCTCT 
TAACCTCCCA TTCAAATTCA CTAATGCACA ATTGGATCTT TATGCTGCTA GCACACATAA 
CCAAGAGCCA ATATCCTAGT AACGACGCAC CATAGTAGAC CGAAT 
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 207 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 120

(D) OTHER INFORMATION: /note= "N = A or C or G or T"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 129

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 162

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 178

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 194

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 195
(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 199

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 203

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
ATGAACyVTTT CACCAGAGAC AGTAAATAAA CTACAACTOG ATGCATCGTG TATAAGAAAC
ATCTGTATTT TAGCACATGT CGACCACGGX AAAACCTCAT TGAGTGACTC ATTATTAGCN 120

ACCAATGGMA TCATTTCCCA ACCTATGGCA CGTAAAGTTA GNTATCTTGA TTCGAGANGA 180 

GATGAACAAT TGANNGGTNT AANCATG 207 

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 69 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Met Lys Ile Ser Pro Glu Thr Val Asn Lys Leu Gln Ser Asp Ala Ser
1 5 10 15

Cys Ile Arg Asn Ile Cys Ile Leu Ala His Val Asp His Gly Lys Thr
20 25 30

Ser Leu Ser Asp Ser Leu Leu Xaa Thr Asn Xaa Ile Ile Ser Gln Arg
35 40 45

Met Ala Gly Lys Val Xaa Tyr Leu Asp Ser Arg Xaa Asp Glu Gln Leu
50 55 60

Xaa Gly Xaa Xaa Met
65

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2510 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 2481

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
AAGTCATGCG ATTGCAACAA GGATCACAAG AACCAGAAGT TCACGAACAT TTGATTAATT 60 
TGATTGATTC ACCTGGGCAT ATTGACTTTT CGTCTCAAGT GAGTACTTCT TCGAGATTAT 120 
GTGATGGTGC AGTTGTTTTG GTCCATGTCG TCGAAGGTGT CTGCTCACAA ACAGTCAACG 180 
TTCTACGCCA ATGTTGGATT GATAAGTTGA AGCCATTACT AGTTATTAAC AAAATTGATA 240 
GGTTAATCAC AGAATGGAAA TTGTCTCCCT TCGAGGCATA CCAACACATT TCCAGAATTA 300
TAGAACAAGT AAACTCTGTG ATTGGGTCAT TTTTTGCTGC TGATAGACTA GAAGATGACT 360
TGAATTGGCG TGACGCTGGT TCTGTCGGGG ASTTTATCGA GAAGAGTGAT GAAGACTTGT 420
ATTTCACACC TGAAAAGAAT AATGTAATAT TTGCCTCGGC AATAGATGGA TGGGCATTTT 480
CAGTCAATAC ATTTGCCAAA ATATACCTGA AAAAATTAGG GTTCTCTCAA CAAGCATTGT 540
CAAAAACTCT CTGGGGAGAC TTTTACTTGG ATATGAAAAA TAAAAAAATC ATCCCTGGTA 600
AAAAATTGAA AAATAATAGT AACAGTTTGA AGCCATTATT TGTTTCGTTG ATTTTGGACC 660
ACGTTTGGGC TGTTTATGAA AACTGTGTTA TTGAAAGAAA TCAAGACAAG TTGGAAAAAA 720
TCATTGAfiAA ATTAGGGGCC AAAATCACCC CTCGTGATTT GCGATCCAAA GATTACAAGA 780
ACTTGCTAAA CTTGATTATC TCTCAGTGGA TTCCTTTGAG TCATGCCATA TTGGGGTCAG 840
TGATTGAATA CTTGCCAAGC CCCATTGTTG CTCAGCGTGA AAGAATAGAC AAGATTTTGG 900
ATGJVAACGAT TTATAGTGCA GTGGATTCAG AACTGAGATA AATCCAAACT AGTCGACCCT 960
TCATTTGTCA ASGCGATGCA GGAATGTGAT AGTTCACACC CGGAAACCCA TACAATAGCA 1020
TATGTATCAA AATTGTTGTC AATCCCCAAT GAAGACTTAC CCAAAGCTAG TAATGCCGCT 1080
ACTGGACXyVT TGACGGCCGA TGAAATCCAA GAACGACGAA GAATTGCTCG AGAATTAGCG 1140
AAAAAGGCAT CTGAAGCAGC TGCTTTGGCA CAAGAAGCTT CCAAAAATGA AGATGAGTTT 1200
GCCATTAAAC CCAAGAAAGA TCCATTTGAA TGGGAATTTG AGGAGGACGA TTTTGAGAAT 1260
GAGGAAGATG AGAGCGATGC AAACGCAGTT GAAGAATCAA CTGAAACCAT AGTGGGTTTC 1320
ACTCCTATTT ATTCTGGATC GTTATCTAGA GGCCAAAAGC TCACGGTAAT TGGACCCAAA 1380
TACGACCCTT CATTACCTAG AGACCATCAA ACCAACTTTG AACAAATAAC CAATGAAGTT 1440
GAAATTAAAG ACTTGTTTTT AATCATGGGA CGAGAATTAG TGAGAATGGA AAAAGTCCTG 1500
CGGGTAATAT TGTTGGGCTT GTTGGATTGG ATACGCCGTG CTTAAGAATG CCACAATTTG 1560
CTCACCGTTA CCTGAAGATA AACCATACAT TAATTTAGCT TCAACATCAA CCTTGATCCA 1620
CAATAAACCA ATTATGAAAA TAGCAGTTGA ACCAACAAAC CCAATAAAAC TAGCAAAATT 1680
GGAACGAGGA TTAGATTTAT TGGCCAAAGC CGACCCGCTT TTGGAATGGT ATGTCGACGA 1740
CGAGTCAGGT GAATTGATTG TTTGTCTTGC TGGAGAATTG CATCTAGAAC GATGCTTGAA 1800
AGATTTAGAA GAGAGATTCG CTAAGGGTTG TGAAGTTACC GTCAAAGAGC CAGTCATTCC 1860
CTTCAGAGAG GGGTTGGCAG ATGACAAAAT CAGTACCAAC ACCAATAATA ACAACGACGA 1920
CAATGAAGAT CATGAATTAG ATGAAAACGA AGATGAGCTT GCTGATTTAG AGTTTGATAT 1980
TTCTCCGTTG CCATTAGAAG TGACTCAGTT TTTAATTGAG AATGAAACGA TTATTGCCGA 2040
AATTGTCAAC AACAAGCAAG ATACTCATGA AATTAGAAAC GATTTTATTG 2100
CACTATTATT GATAATTCTA ATTTGGCTAC ACAATTTCCA GACACCAAGT CTTTTATwvi 2160
CAATATAATT TCCTTTGGAC CTAAACGTCT TGGGCCTAAT ATTTXC A A 1 \M 2220
GTTAAACAAA TTTAGACATC TACTTGGTGA ATCTGCCACT GAATCTCGAT TTGTTTATGA 2280
GAATAATGTG TTCAATGCGG TTCAATTGGT ATTCAATGGG GGTCCGTTAG CATCAGACCC 2340
AATGCAAGGT ATTATTGTTA GACTTAAGAA GGCACAAAAA AGAGAAGTTG ACGAGGATAA 2400
GATAGTCAAC CCTGCTAAAA TAATCACACA GACTCGTGAC TTGATTTACA ACCGGTTTTT 2460
GCAAAAATCA CCACGCTTGT NCCTTGCAAT GTATACGTGT GAAATCCAAG 2510




(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 310 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Val Met Arg Leu Gln Gln Gly Ser Gln Glu Pro Glu Val His Glu His
1 5 10 15

Leu Ile Asn Leu Ile Asp Ser Pro Gly His Ile Asp Phe Ser Ser Glu
20 25 30

Val Ser Thr Ser Ser Arg Leu Cys Asp Gly Ala Val Val Leu Val Asp
35 40 45

Val Val Glu Gly Val Cys Ser Gln Thr Val Asn Val Leu Arg Gln Cys
50 55 60

Trp Ile Asp Lys Leu Lys Pro Leu Leu Val Ile Asn Lys Ile Asp Arg
65 70 75 80

Leu Ile Thr Glu Trp Lys Leu Ser Pro Leu Glu Ala Tyr Gln His Ile
85 90 95

Arg Ile Ile Glu Gln Val Asn Ser Val Ile Gly Ser Phe Phe Ala
100 105 110

Gly Asp Arg Leu Glu Asp Asp Leu Asn Trp Arg Glu Ala Gly Ser Val
115 120 125

Gly Glu Phe Ile Glu Lys Ser Asp Glu Asp Leu Tyr Phe Thr Pro Glu
130 135 140

Lys Asn Asn Val Ile Phe Ala Ser Ala Ile Asp Gly Trp Ala Phe Ser
145 150 155 160

Val Asn Thr Phe Ala Lys Ile Tyr Ser Lys Lys Leu Gly Phe Ser Gln
165 170 175

Gln Ala Leu Ser Lys Thr Leu Trp Gly Asp Phe Tyr Leu Asp Met Lys
180 185 190

Asn Lys Lys Ile Ile Pro Gly Lys Lys Leu Lys Asn Asn Ser Asn Ser
195 200 205

Leu Lys Pro Leu Phe Val Ser Leu Ile Leu Asp Gln Val Trp Ala Val
210 215 220

Tyr Glu Asn Cys Val Ile Glu Arg Asn Gln Asp Lys Leu Glu Lys Ile
225 230 235 240

Ile Glu Lys Leu Gly Ala Lys Ile Thr Pro Arg Asp Leu Arg Ser Lys
245 250 255

Asp Tyr Lys Asn Leu Leu Asn Leu Ile Met Ser Gln Trp Ile Pro Leu
260 265 270

Ser His Ala Ile Leu Gly Ser Val Ile Glu Tyr Leu Pro Ser Pro Ile
275 280 285

Val Ala Gln Arg Glu Arg Ile Asp Lys Ile Leu Asp Glu Thr Ile Tyr
290 295 300

Ser Ala Val Asp Ser Glu
305 310

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 188 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Asp Lys Ser Lys Leu Val Asp Pro Ser Phe Val Lys Ala Met Gln Glu
1 5 10 15

Cys Asp Ser Ser His Pro Glu Thr His Thr Ile Ala Tyr Val Ser Lys
20 25 30

Leu Leu Ser Ile Pro Asn Glu Asp Leu Pro Lys Ala Ser Asn Ala Ala
35 40 45

Thr Gly Gly Leu Thr Ala Asp Glu Ile Gln Glu Arg Gly Arg Ile Ala
50 55 60

Arg Glu Leu Ala Lys Lys Ala Ser Glu Ala Ala Ala Leu Ala Gln Glu
65 70 75 80

Gly Ser Lys Asn Glu Asp Glu Phe Ala Ile Lys Pro Lys Lys Asp Pro
85 90 95

Phe Glu Trp Glu Phe Glu Glu Asp Asp Phe Glu Asn Glu Glu Asp Glu
100 105 110

Ser Asp Ala Asn Ala Val Glu Glu Ser Thr Glu Thr Ile Val Gly Phe
115 120 125

Thr Arg Ile Tyr Ser Gly Ser Leu Ser Arg Gly Gln Lys Leu Thr Val
130 135 140

Ile Gly Pro Lys Tyr Asp Pro Ser Leu Pro Arg Asp His Gln Thr Asn
145 150 155 160

Phe Glu Gln Ile Thr Asn Glu Val Glu Ile Lys Asp Leu Phe Leu Ile
165 170 175

Met Gly Arg Glu Leu Val Arg Met Glu Lys Val Ser
180 185

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 336 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Gly Asn Ile Val Gly Val Val Gly Leu Asp Xaa Ala Val Leu Lys Asn
1 5 10 15

Ala Thr Ile Cys Ser Pro Leu Pro Glu Asp Lys Pro Tyr Ile Asn Leu
20 25 30

Ala Ser Thr Ser Thr Leu Ile His Asn Lys Pro Ile Met Lys Ile Ala
35 40 45

Val Glu Pro Thr Asn Pro Ile Lys Leu Ala Lys Leu Glu Arg Gly Leu
50 55 60

Asp Leu Leu Ala Lys Ala Asp Pro Val Leu Glu Trp Tyr Val Asp Asp
65 70 75 80

Glu Ser Gly Glu Leu Ile Val Cys Val Ala Gly Glu Leu His Leu Glu
85 90 95

Arg Cys Leu Lys Asp Leu Glu Glu Arg Phe Ala Lys Gly Cys Glu Val
100 105 110

Thr Val Lys Glu Pro Val Ile Pro Phe Arg Glu Gly Leu Ala Asp Asp
115 120 125

Lys Ile Ser Thr Asn Thr Asn Asn Asn Asn Asp Asp Asn Glu Asp His
130 135 140

Glu Leu Asp Glu Asn Glu Asp Glu Leu Ala Asp Leu Glu Phe Asp Ile
145 150 155 160

Ser Pro Leu Pro Leu Glu Val Thr Gln Phe Leu Ile Glu Asn Glu Thr
165 170 175

Ile Ile Ala Glu Ile Val Asn Asn Lys Gln Asp Thr His Glu Ile Arg
180 185 190

Asn Asp Phe Ile Glu Lys Phe Ala Thr Ile Ile Asp Asn Ser Asn Leu
195 200 205

Ala Thr Gln Phe Pro Asp Thr Lys Ser Phe Ile Asn Asn Ile Ile Cys
210 215 220

Phe Gly Pro Lys Arg Val Gly Pro Asn Ile Phe Ile Glu Asp Tyr Gly
225 230 235 240

Leu Asn Lys Phe Arg His Leu Leu Gly Glu Ser Ala Thr Glu Ser Arg
245 250 255

Phe Val Tyr Glu Asn Asn Val Phe Asn Gly Val Gln Leu Val Phe Asn
260 265 270

Gly Gly Pro Leu Ala Ser Glu Pro Met Gln Gly Ile Ile Val Arg Leu
275 280 285

Lys Lys Ala Glu Lys Arg Glu Val Asp Glu Asp Lys Ile Val Asn Pro
290 295 300

Gly Lys Ile Ile Thr Gln Thr Arg Asp Leu Ile Tyr Lys Arg Phe Leu
305 310 315 320

Gln Lys Ser Pro Arg Leu Xaa Leu Ala Met Tyr Thr Cys Glu Ile Gln
325 330 335



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 841 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 8

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 9

(D) OTHER INFORMATION: /note= "N = A or T or G or C"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 18

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
CGCGAACNNT CAATCATNTC AGAAGAAATG AAAGAAGCTA CTCCGTTCTT TACTATTGTG 60
GCAAGAATCC CTGTGATTGA GGCATTTGGC TTTTCCGAGG ATATTAGAAA GAAGACATCC 120
GGGGCAGCTA GTCCTCAATT AGTTTTTGAT GGGTATGATA TGTTAGATAT CGATCCATTT 180
TGGGTTCCAC ATACTGAAGA AGAATTAGAA GAATTGGCTG AATTTGCAGA AAGAGAAAAT 240
GTTGCTAGAA GATATATGAA TAATATCAGA AGAAGAAAAG GGTTATTTGT TGATGAGAAA 300
GTCCTCAAAA ATGCTGAAAA GCAAAGAACT TTGAAAAGAG ATTAGATTAT CCAGTAAAAC 360
AGGCAATATG TGTGAAATTG TTACAGAAAA GACACATACG ATGTGGCCAT TATTTGTTTA 420
ATATTCAACA ACAAGTAAAT GTATTGATAT AGATGTATAA TATA6TCAAA TGTTGAGACT 480
ATCCGAATAC ACATAGACAC ACAACTCAGC CTGTCAGGGC TGTTTATTAA GTTGTGATGT 540
ATACTAAAAT CCATCCACAC TTCTCCTAAT TCTACGGAAG AATTACAAAA AAGATCACAT 600
AAAAATAATA ATTCTATCAC ACTTTGAAAA TTTGATTGAA GGTGTTACTA GTATTGTTTC 660
AACATTACTC TTTTCAAACA ACGAGATCCA AATACTGCAC AATCTTCAAA CGAACGGAGT 720
TACATCACTA TAGTTTTCTA TTGTTCTAAC ATCAATACAG ACAAAAAGAA AGTGTAGCAT 780
AAATAATTGA TTGCAATTTG CCAAACTAGA AAACAAAGAC GAAAAAAAGA AAAAAATTTC 840
A

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 114 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Arg Glu Xaa Ser Ile Xaa Ser Glu Glu Met Lys Glu Gly Thr Pro Phe
1 5 10 15

Phe Thr Ile Val Ala Arg Ile Pro Val Ile Glu Ala Phe Gly Phe Ser
20 25 30

Glu Asp Ile Arg Lys Lys Thr Ser Gly Ala Ala Ser Pro Gln Leu Val
35 40 45

Phe Asp Gly Tyr Asp Met Leu Asp Ile Asp Pro Phe Trp Val Pro His
50 55 60

Thr Glu Glu Glu Leu Glu Glu Leu Gly Glu Phe Ala Glu Arg Glu Asn
65 70 75 80

Val Ala Arg Arg Tyr Met Asn Asn Ile Arg Arg Arg Lys Gly Leu Phe
85 90 95

Val Asp Glu Lys Val Val Lys Asn Ala Glu Lys Gln Arg Thr Leu Lys
100 105 110

Arg Asp

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 564 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
AACCTAAAAA TGGCTAAGTT CATCAAATCT GGTAAAGTTG CTATTGTTGT AAGAGGTCGT 60
TACGCTGGTA AAAAAGTAGT CATTGTGAAA CCACATGATG AAGGTACCAA ATCTCACCCA 120
TTCCCACATG CCATTGTCCC TGGTATTGAA AGAGCTCCAT TGAAGGTTAC CAAGAAGATG 180
GATGCTAAAA AAGTTACCAA AAGAACTAAA GTCAAGCCAT TTGTTAAATT AGTAAACTAC 240
AACCATTTAA TGCCAACTAG ATACTCATTG GATGTTGAAT CATTCAAATC TGCTGTCACT 300
TCTGAAGCTT TAGAAGAACC ATCTCAAAGA GAAGAAGCTA AAAAAGTTGT CAAGAAGGCT 360
TTTGAAGAAA AACATCAAGC TGGTAAGAAC AAATGGTTCT TCCAAAAATT ACACTTTTAA 420
GAAAGGAACC ACCTTTATTT GAATGTTTGT AATATAGGTT GAATCAGAGA GACAAAGTAG 480
AAGAAAATAC AAAAAAGACA GTATATCTGT ATAGTATAAT TTAATGGGGG TCTAATTTAC 540
TTACCACTTT ATTCGTGCAT TATT
(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 136 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Met Ala Lys Phe Ile Lys Ser Gly Lys Val Ala Ile Val Val Arg Gly
1 5 10 15

Arg Tyr Ala Gly Lys Lys Val Val Ile Val Lys Pro His Asp Glu Gly
20 25 30

Thr Lys Ser His Pro Phe Pro His Ala Ile Val Ala Gly Ile Glu Arg
35 40 45

Ala Pro Leu Lys Val Thr Lys Lys Met Asp Ala Lys Lys Val Thr Lys
50 55 60

Arg Thr Lys Val Lys Pro Phe Val Lys Leu Val Asn Tyr Asn His Leu
65 70 75 80

Met Pro Thr Arg Tyr Ser Leu Asp Val Glu Ser Phe Lys Ser Ala Val
85 90 95

Thr Ser Glu Ala Leu Glu Glu Pro Ser Gln Arg Glu Glu Ala Lys Lys
100 105 110

Val Val Lys Lys Ala Phe Glu Glu Lys His Gln Ala Gly Lys Asn Lys
115 120 125

Trp Phe Phe Gln Lys Leu His Phe
130 135

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1192 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
TTTGAAACGA TTAAGTCCAA TCAAACAATC TTATTCAAAA GTACTCGCAA TACGTACAAT 
GTCAATTCCA TCTACTCAGT ACGGATTTTT TTATAATAAA GCTAGTGGTC TTAATTTGAA 
AAAAGACTTG CCGGTTAACA AGCCAGGTGC TGGTCAATTG CTTTTAAAGG TTGATGCAGT 
TGGCCTTTGT CATTCAGATT TACATGTTCT CTATGAAGGT TTGGATTGTG GTGATAATTA 
TGTGATGGGC CACGAAATTG CTGGGACTGT TGCTGAACTA GGTGAAGAGG TGAGTGAGTT 
TGCAGTTGGA GATCGTGTCG CTTGTGTCGG CCCCAATGGA TGTGGTCTTT GTAAACACTG 
TCTTACTGGT AACGATAATG TTTGTACCAA GTCGTTTTTG GATTGGTTTG GATTGGGTTA 
CAATGGAGGT TACGAGCAAT TTTTGTTAGT CAAGAGACCA AGAAACTTGG TCAAGATCCC 
TGACAATGTT ACTTCCGAGG AAGCTGCAGC TATTACGGAT GCCGTATTGA CTCCTTACCA 
TGCTATCAAG TCTGCAGGTG TTGGTCCAGC AAGTAATATA TTAATTATCG GAGCTGGTGG 
ATTAGGAGGT AACGCTATTC AAGTTGCAAA AGCATTTGGT GCGAACGTTA CTGTTTTGGA
TAAAAW3GAT AAGGCAAGAG ACCAAGCTAA GGCCTTTGGA GCTGACCAGG TTTACAGTGA 720
ATTACCAGAC AGCCTTTTAC CTGGOTCATT CAGTGCTTGT TTTGATTTTG TTTCGGTTCA 780
GGCAACATAC GATTTGTGTC AAAAGTATTG TGAGCCAAAG CGTACTATTC TTCCCGTACG 840
TCTACOTGCA ACTTCGCTTA ACATAAATCT TOCTGATTTA GATCTTCGTG AAATTACCGT 900
CAAGGGCTCA TTCTGGGGTA CCCTGATGGA TTTAAGAGAA GCATTTGAAT T6GCTGCACA 960
GGGfMGGTC AAACCAAATG TTGCTCATGC TCGATTGTCA GAATTGCCTA AGTATATGGA 1020
GAACTTGAGA GCCCCTCCTT ATGAAGGAA6 AGTCGTCTTT AATCCATAAT ACTGAAAAGT 1080

GAAGAAACCA TCAATAATAG CTTGCTGAGT ATGTATGGGA AATATTCATT TATGTATGTA 1140 
GGTCATTTAT ATGTGTCTAA TGATTTCTAA TCTGAATTTC GTACAATTCT TT 1192 
(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 336 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Met Ser Ile Pro Ser Thr Gln Tyr Gly Phe Phe Tyr Asn Lys Ala Ser
1 5 10 15

Gly Leu Asn Leu Lys Lys Asp Leu Pro Val Asn Lys Pro Gly Ala Gly
20 25 30

Gln Leu Leu Leu Lys Val Asp Ala Val Gly Leu Cys His Ser Asp Leu
35 40 45

His Val Leu Tyr Glu Gly Leu Asp Cys Gly Asp Asn Tyr Val Met Gly
50 55 60

His Glu Ile Ala Gly Thr Val Ala Glu Leu Gly Glu Glu Val Ser Glu
65 70 75 80

Phe Ala Val Gly Asp Arg Val Ala Cys Val Gly Pro Asn Gly Cys Gly
85 90 95

Leu Cys Lys His Cys Leu Thr Gly Asn Asp Asn Val Cys Thr Lys Ser
100 105 110

Phe Leu Asp Trp Phe Gly Leu Gly Tyr Asn Gly Gly Tyr Glu Gln Phe
115 120 125

Leu Leu Val Lys Arg Pro Arg Asn Leu Val Lys Ile Pro Asp Asn Val
130 135 140

Thr Ser Glu Glu Ala Ala Ala Ile Thr Asp Ala Val Leu Thr Pro Tyr
145 150 155 160

His Ala Ile Lys Ser Ala Gly Val Gly Pro Ala Ser Asn Ile Leu Ile
165 170 175

Ile Gly Ala Gly Gly Leu Gly Gly Asn Ala Ile Gln Val Ala Lys Ala
180 185 190

Phe Gly Ala Lys Val Thr Val Leu Asp Lys Lys Asp Lys Ala Arg Asp
195 200 205

Gln Ala Lys Ala Phe Gly Ala Asp Gln Val Tyr Ser Glu Leu Pro Asp
210 215 220

Ser Val Leu Pro Gly Ser Phe Ser Ala Cys Phe Asp Phe Val Ser Val
225 230 235 240

Gln Ala Thr Tyr Asp Leu Cys Gln Lys Tyr Cys Glu Pro Lys Gly Thr
245 250 255

Ile Val Pro Val Gly Leu Gly Ala Thr Ser Leu Asn Ile Asn Leu Ala
260 265 270

Asp Leu Asp Leu Arg Glu Ile Thr Val Lys Gly Ser Phe Trp Gly Thr
275 280 285

Ser Met Asp Leu Arg Glu Ala Phe Glu Leu Ala Ala Gln Gly Lys Val
290 295 300

Lys Pro Asn Val Ala His Ala Pro Leu Ser Glu Leu Pro Lys Tyr Met
305 310 315 320

Glu Lys Leu Arg Ala Gly Gly Tyr Glu Gly Arg Val Val Phe Asn Pro
325 330 335

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2021 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1270

(D) OTHER INFORMATION: /note= "R = A or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1395

(D) OTHER INFORMATION: /note= "R = A or G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

ATGGAAAAAA TTGACATTAA TACAAATTCA AACAAAATCC AACAAGCATA CGATAAAGTT 60

GTTAGRGGAG ACCCAAATGC AACATTCGTC GTTTATTCTG TTGACAAAAA CGCCACTATG 120

GACGTCACTG AAACAGGGGA CGGATCATTA GAGGATTTTG TTGAACATTT TACTGATGGA 180

CAAGTTCAAT TTGCTTTAGC CAGGGTTACT GTTCCAGGAT CTGACGTTTC CAAAAACATC 240

TTGTTAGGAT GGTGTCCTGA CAGTGCTCCA GCAAAATTGA GATTGTCATT TCCCAATAAT 300

TTTGCTGATG TGTCCAGACT ATTGAGCGGA TACCATGTGC AAATTACTGC AAGGGATCAA 360

GATGATTTAG ACGTGAATGA ATTCTTGAAT AGAGTTGGTG CTGCTGCTGG TGCAACATAT 420

TCCACTCAAA CTTCCGGACT CAAAAAACCA TCCCCTGCTG CACCTAAACC TACTTCAAAA 480

CCTCTTGTTG CTAAATCTAG TTCTGCTTCA AAACCTTCAT TTGTACCCAA ATCTACTGGC 540

AAGCCTGTTG CTCCAGCTAA GCCAAAACCA AAGAACATCA CCAAGGATGC TGGTTGGGGT 600

GATGCTGAAG ACGTTGAGGA AAGAGACTTT GACAAGAAAC 660

GCATATAAAC CAACAAAGGT TAACATTGAC ^ ft ft flMIl^ ft ^ ft ft GAATTGAGAA ft ftr-ftft ftft ftff AKJXTHfHarT 720

AGCTCAACTC CTAAAACATT CAAATCTGAA ^#«ft^ftftr*ftft^ 780

CAATCCAAAC CTTTATCGGA AACGATGAAA GCCTATGaTw 840

AGATTGACTT CTTTACCAAA ACCAAAGATT ft ^ft*IUI^<l<IA 900

AGTGCATCTG GGAATGGTGC TGCTCCTGCG TXTuivXuCT/i jmrCJUSCATT TGGTACACAA 960

TCAGTTGATT CAAGAAAGGA TAIWWXTuvTA GGTGGTTTGT ggftCACATTT TGGTGCTGAA 1020

AATGGAAAAA CTCCGGCACA fiAAAATACAA AACAGTGGCC 1080

TCCGATGAGA AAGAAACTAA AiGCCAGAGGA ACATCATGCT 1140

GCCGACTTGG CCAAAAAATT CTGGCGATAC TCCTTCCTTG 1200

CCAACTAGAA ACTTACCACC AGCACCACCA ^»»ft^r»ftr;ftftft rCGCAATTCC ATCTAACGAA 1260

AAAGACAAAR AAGAAAAGGA AGAGGAAGAA rappATCTTT fyCTACTAGA 1320

AACTTACCAC CACCGTCACA AAGACAACCT facAAf^CAGA. ACaAAfiAGGAG 1380

GAAGAAGAAG AAGARGAGGC MM AH A^ Ml A ^ TCCTGCTCCA AGCTTACCAEv 1440

CCAAAAGCAG AAGCAGMGA ATCAAAAAAA M|M ft ft MM ft CAGTvAACCa 1500

TACGAAAAGG ACGAAGATAA TGAAATTGGA fSTftACTTCAT TATTGATATT wi uiniu X X utr%x *^x*wn**ft** 1560

GAATTTGTGG ATGACGATTG GTGGCAAGGT ft & ft & AAACTGGTGA AGTTGGTTTG 1620

TTTCCTGCCA CITATGTGTC ft HMItlt ft ft f|«^ ft ft ATTAAATGilWV a & H / VT/IPT/t /vUiUV> 1 WW J> UP ACAAAGAAGA GGAAGCCCCA 1680

GCTCCAGCTC CAGCGCCATC ATTACCTTCT AGAGAAGAAA ^ ft M« ft ft ft ft /^/^ ft 'I"!'' & CACAAGCAGC ACCAGCATaA 1740

CCAAGTAGAT CAGACCAAAA ACCAGAATCA AAAACTGCTA CAGCTGAATA CGATTACGAA 1800

AAGGACGAAG ACAATGAAAT TGGTTTTTCA GAAGGTGATT TCATTGTTGA AATCGAATTT 1860

GTTGACGATG ATTGGTGGCA AGGAAAACAT TCCAAGACAG GAGAAGTCGG ATTGTTCCCT 1920

GCTAACTATG TTGTCTTGAA TGAGTAGATT TAGTATAAAC AATATTCGTT TTTTTTTTAT 1980

ATGAATCTAT AATATAAATA CAAAGAAAAG ATAAATTGGT G 2021



(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 648 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Met Glu Lys Ile Asp Ile Asn Thr Asn Ser Asn Lys Ile Gln Gln Ala
1 5 10 15

Tyr Asp Lys Val Val Arg Gly Asp Pro Asn Ala Thr Phe Val Val Tyr
20 25 30

Ser Val Asp Lys Asn Ala Thr Met Asp Val Thr Glu Thr Gly Asp Gly
35 40 45

Ser Leu Glu Asp Phe Val Glu His Phe Thr Asp Gly Gln Val Gln Phe
50 55 60

Gly Leu Ala Arg Val Thr Val Pro Gly Ser Asp Val Ser Lys Asn Ile
65 70 75 80

Leu Leu Gly Trp Cys Pro Asp Ser Ala Pro Ala Lys Leu Arg Leu Ser
85 90 95

Phe Ala Asn Asn Phe Ala Asp Val Ser Arg Val Leu Ser Gly Tyr His
100 105 110

Val Gln Ile Thr Ala Arg Asp Gln Asp Asp Leu Asp Val Asn Glu Phe
115 120 125

Leu Asn Arg Val Gly Ala Ala Ala Gly Ala Arg Tyr Ser Thr Gln Thr
130 135 140

Ser Gly Leu Lys Lys Pro Ser Pro Ala Ala Pro Lys Pro Thr Ser Lys
145 150 155 160

Pro Val Val Ala Lys Ser Ser Ser Ala Ser Lys Pro Ser Phe Val Pro
165 170 175

Lys Ser Thr Gly Lys Pro Val Ala Pro Ala Lys Pro Lys Pro Lys Asn
180 185 190

Ile Thr Lys Asp Ala Gly Trp Gly Asp Ala Glu Asp Val Glu Glu Arg
195 200 205

Asp Phe Asp Lys Lys Pro Leu Asp Asn Val Pro Ser Ala Tyr Lys Pro
210 215 220

Thr Lys Val Asn Ile Asp Glu Leu Arg Lys Gln Lys Ser Asp Thr Thr
225 230 235 240

Ser Ser Thr Pro Lys Thr Phe Lys Ser Glu Pro Gln Glu Glu Lys Asn
245 250 255

Asp Asp Asp Gly Gln Ser Lys Pro Leu Ser Glu Arg Met Lys Ala Tyr
260 265 270

Asp cm Pro Ser Ser Ser Asp Gly Arg Leu Thr Ser Leu Pro Lys Pro
275 280 285

Lys Ile Gly His Ser Val Ala Asp Lys Tyr Lys Ala Ser Ala Ser Gly
290 295 300

Asn Gly Ala Ala Pro Ala Phe Gly Ala Lys Pro Ala Phe Gly Thr Gln
305 310 315 320

Ser Val Asp Ser Arg Lys Asp Lys Leu Val Gly Gly Leu Ser Arg Asp
325 330 335

Phe Gly Ala Glu Asn Gly Lys Thr Pro Ala Gln Ile Trp Ala Glu Lys
340 345 350

Arg Gly Lys Tyr Lys Thr Val Ala Ser Asp Glu Lys Glu Thr Asn Ser
355 360 365

Ser Glu Lys Val Asp Glu Pro Glu Glu His His Ala Ala Asp Leu Ala
370 375 380

Lys Lys Phe Glu Glu Lys Ala Asn Ile Ala Gly Asp Thr Pro Ser Leu
385 390 395 400

Pro Thr Arg Asn Leu Pro Pro Ala Pro Pro Ala Arg Glu Thr Ala Ile
405 410 415

Pro Ser Asn Glu Lys Asp Lys Xaa Glu Lys Glu Glu Glu Glu Gln Ala
420 425 430

Pro Ala Pro Ser Leu Pro Thr Arg Asn Leu Pro Pro Pro Ser Gln Arg
435 440 445

Gln Pro Glu Pro Glu Pro Glu Pro Glu Glu Glu Glu Glu Glu Glu Glu
450 455 460

Xaa Glu Ala Pro Ala Pro Ser Leu Pro Ala Arg Asn Leu Pro Pro Ala
465 470 475 480

Pro Lys Ala Glu Ala Glu Glu Ser Lys Lys Gln Ser Thr Thr Ala Thr
485 490 495

Ala Glu Tyr Asp Tyr Glu Lys Asp Glu Asp Asn Glu Ile Gly Phe Ser
500 505 510

Glu Gly Asp Leu Ile Ile Asp Ile Glu Phe Val Asp Asp Asp Trp Trp
515 520 525

Gln Gly Lys His Ala Lys Thr Gly Glu Val Gly Leu Phe Pro Ala Thr
530 535 540

Tyr Val Ser Leu Asn Glu Lys Ala Ala Asp Lys Glu Glu Glu Ala Pro
545 550 555 560

Ala Pro Ala Pro Ala Pro Ser Leu Pro Ser Arg Glu Glu Thr Gln Ala
565 570 575

Ala Pro Ala Leu Pro Ser Arg Ser Glu Gln Lys Pro Glu Ser Lys Thr
580 585 590

Ala Thr Ala Glu Tyr Asp Tyr Glu Lys Asp Glu Asp Asn Glu Ile Gly
595 600 605

Phe Ser Glu Gly Asp Leu Ile Val Glu Ile Glu Phe Val Asp Asp Asp
610 615 620

Trp Trp Gln Gly Lys His Ser Lys Thr Gly Glu Val Gly Leu Phe Pro
625 630 635 640

Ala Asn Tyr Val Val Leu Asn Glu
645

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1340 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

ATCTGTGACG TCGTATTAGC ATCTCAATGG GGGGATGAAG GTAAAGGTAA ATTAGTCGAT 60 

TTATTATGTC ATGATATCGA TGTTTGTGCC AGGTGTCAAG GTGGTAACAA TGCTGGCCAC 120 

ACAATTGTTG TTGGTAAAGT CAAGTATGAC TTCCACATGT TACCTTCTGG TTTGGTCAAT 180 

CCTAAATGTC AAAACTTAGT TGGATCTGGT GTTGTTATCC ACGTTCCTTC CTTCTTTGCT 240 

GAATTGGAAA ACTTGGAAGC AAAAGGGTTA GATTGTCGTG ATAGATTGTT TGTTTCATCT 300

AGAGCTCATT TGCTCTTTGA CTTCCATCAA CGTACTGATA AATTGAAAGA AGCTGAATTA 360

TCAACCAATA AGAAATCAAT AGGTACTACC CGTAAACCTA TTGGTCCAAC XTACTCAACC 420 

AAGGCAACTA GATCACGTAT CAGAGTCCAC CATTTACTCA ACCCtGATCC AGAAGCTTGC 480 

GAAfiAATTCA AAACTAGAXA TTTGAGATTA GTCGAGACTA GACAAAAAAG ATACGGTGAA 540 

TTTGAATATC ATCCTAAGGA AGAATTGGCA AfiAITTGAAA AATACCCTCA AACCTTGAGA 600 

CCATTCGTCG TCGACTCCGT CAACTTCATG CACfiAAGCTA TTGCTGCCAA TAAAAAAATC 660 

TTGGrrGAAS CTGCTAATGC GTTAATGTTG GATATTGATT TCGGTACTTA TCCATACGTC 720 

ACTTCTTCAT CAACTGGTAT TGGTCGTGTT TTGACTGGGT TGGGTATTCC TCCAAGAACC 780 

ATCAGAAATG TCTATGGTGT TGTTAAAGCC TACACCACTA GACTTGGTGA GOGTCCATTC 840 

CCAACAGAAC AAtTGAACAA GGTAGGTGAA ACTTTGCAAO ATGTTCGTGC CGAATATGGT 900 

GTTACTACTG GAAfiAAAAAS AAGATGTGCT TOCTTGGATT TGGTTGTGTT GAAATATTCC 960 

AACCTGATCA ACGGAtACAC TTCTTTGAAC ATCACCAAAT TG6ATGTTTT GGATAAATTC 1020 

AAGGAAATTG AAGTTGGTGT TCCTTATAAA TtGAATGGAA AAGAGTTGCC AAGTTTCCCT 1080 

GAAGATTTGA TTGATTTACC TAAAGTCGAG GTTGTGTATA AGAAATTCCC AGGTTGGGAA 1140 

CAAGATATCA CCGGTATCAA GAAATATGAA GACTTGCCAG AAAACGCTAA GAACTATCTT 1200 

AAATTCATTC AAGATTACTT GCAAGTTCCA ATCCAATGGC TAGCTACCGG TCCAGCTAGA 1260 

GATTCTATGT TAGAAAAGAA GATTTAGTTG TACACATGCT ACGGAAGACG ATTAGATTTG 1320 

TTTTATTAGA TTAATAACCT 

(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Met Cys Asp Val Val Leu Gly Ser Gln Trp Gly Asp Glu Gly Lys Gly
1 5 10 15

Lys Leu Val Asp Leu Leu Cys Asp Asp Ile Asp Val Cys Ala Arg Cys
20 25 30 

Gin Gly Gly Asn Asn Ala Gly His Thr He Val Val Gly Lys Val Lys 
35 40 45 

Tyr Asp Phe His Met Leu Pro Ser Gly Leu Val Asn Pro Lys Cys Gln
50 55 60 

Asn Leu Val Gly Ser Gly Val Val He His Val Pro Ser Phe Phe Ala 
65 70 75 80

Glu Leu Glu Asn Leu Glu Ala Lys Gly Leu Asp Cys Arg Asp Arg Leu 
85 90 95 

Phe Val Ser Ser Arg Ala His Leu Val Phe Asp Phe His Gln Arg Thr
100 105 110

Asp Lys Leu Lys Glu Ala Glu Leu Ser Thr Asn Lys Lys Ser Ile Gly
115 120 125

Thr Thr Gly Lys Gly Ile Gly Pro Thr Tyr Ser Thr Lys Ala Ser Arg
130 135 140 

Ser Gly Ile Arg Val His His Leu Val Asn Pro Asp Pro Glu Ala Trp
145 150 155 160

Glu Glu Phe Lys Thr Arg Tyr Leu Arg Leu Val Glu Ser Arg Gln Lys
165 170 175

Arg Tyr Gly Glu Phe Glu Tyr Asp Pro Lys Glu Glu Leu Ala Arg Phe
180 185 190

Glu Lys Tyr Arg Glu Thr Leu Arg Pro Phe Val Val Asp Ser Val Asn 

195 200 205 

Phe Met His Glu Ala He Ala Ala Asn Lys Lys He Leu Val Glu Gly 
210 215 220 

Ala Asn Ala Leu Met Leu Asp He Asp Phe Gly Thr Tyr Pro Tyr Val 
225 230 235 240 

Thr Ser Ser Ser Thr Gly Ile Gly Gly Val Leu Thr Gly Leu Gly Ile
245 250 255

Pro Pro Arg Thr Ile Arg Asn Val Tyr Gly Val Val Lys Ala Tyr Thr
260 265 270

Thr Arg Val Gly Glu Gly Pro Phe Pro Thr Glu Gln Leu Asn Lys Val
275 280 285

Gly Glu Thr Leu Gin Asp Val Gly Ala Glu Tyr Gly Val Thr Thr Gly 
290 295 300 

Arg Lys Arg Arg Cys Gly Trp Leu Asp Leu Val Val Leu Lys Tyr Ser
305 310 315 320

Asn Ser Ile Asn Gly Tyr Thr Ser Leu Asn Ile Thr Lys Leu Asp Val
325 330 335 

Leu Asp Lys Phe Lys Glu He Glu Val Gly Val Ala Tyr Lys Leu Asn 
340 345 350 

Gly Lys Glu Leu Pro Ser Phe Pro Glu Asp Leu He Asp Leu Ala Lys 
355 360 365 

Val Glu Val Val Tyr Lys Lys Phe Pro Gly Trp Glu Gin Asp He Thr 
370 375 380 

Gly Ile Lys Lys Tyr Glu Asp Leu Pro Glu Asn Ala Lys Asn Tyr Leu
385 390 395 400

Lys Phe He Glu Asp Tyr Leu Gin Val Pro He Gin Trp Val Gly Thr 
405 410 415 






Gly Pro Ala Arg Asp Ser Met Leu Glu Lys Lys He 
420 425 

(2) INFORMATION FOR SEQ ID NO: 45:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2481 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 





ATGACTGGTG AAOAAGATAA AAAACAACAT TTTGATGCTT CTGGTGCTTC TGCTGTAGAT 60
GATAAAACAG CAACTGCAAT TTTAAGAAGA AAAAAGAAAG ATAATGCCTT GGTCGTTGAT 120
GACGCCACCA ACGATGACAA TTCTGTCATA ACCATGTCGT CAAACACAAT CGAATTGTTA 180
CAATTATTCC GTGGTGATAC ACTCTTGGTG AAAGGTAAGA AGAGAAAGGA CACAGTGTTG 240
ATCGTTTTAO CTGATGATGA TATGCCTGAT GGCGTTGCTA GAGTTAACAG ATGTGTTCGT 300
AACAATTTGC GTGTCAGATT GGGAGATATC GTTACTGTCC ATCCATGTCC TGATATTAAA 360
TATGCCAACA GAATCTCAGT ATTGCCAATT GCTGATACTG TTGAAGGTAT TAATGGTTCC 420
TTATTCGACC TTTACTTGAA GCCATATTTT GTTGAAGCCT ATAGACCAGT GAGAAAAGGT 480
CSATTTATTCA CTGTGAGGGG TGCTATGAGA CAAGTAGAAT TCAAAGTTGT TGAAGTTGAC 540
CCTGAAGAAA TTGCAATTCT TGCTCAAGAT ACCATTATTC ATTCTGAAGG AGAACCTATT 600
AATCGTGAAG ATGAAGAAAA TAGC7TGAAT GAAGTGGGTT ACGACGATAT TGGAGGTTGT 660
AAGAAACAAA TGGCCCAAAT TAGAGAATTG GTTGAATTGC CTTTAAGACA TCCACAATTA 720
TTCAAATCGA TTGGTATTAA GCCACCAAAG GGTATTTTGA TGTATCGTCC ACCTGGTACC 780
GGTAAAACCA TTATGGCAAG AGCAGTGGCC AATGAAACAG GTGCCTTCTT TTTCTTAATA 840
AATGGTCCAC AAATTATGTC TAAAATGGCT GCTGAGTCTG AATCCAATTT AAGAAAAGCT 900
TTTGAAGAGG CTGAAAAGAA TTCTCCTTCC ATTATTTTCA TTGATGAGAT TGACTCTATT 960
GCCCCAAAGA GAGACAAAAC TAATGGTGAA CTAGAAAGAA GAGTTGTTTC TCAATTGTTA 1020
ACCCTTATGG A7GGTATGAA GGCCAGATCT AATGTAGTTG TTATTGCTGC TACTAACAGA 1080
CCAAATTCTA TTGATCCTGC TTTGAGAAGA TTTGGAAGAT TCGACAGAGA AGTTGACATT 1140
GGTGTTCCGG ATGCTGAAGG ACGTTTAGAG ATTTTGAGAA TCCACACAAA GAATATGAAA 1200
TTGGCTGATC ATGTTGACTT GGAAGCCATC GCTTCTGAAA CACATGGTTT CGTTGGTGCT 1260
GATATTGCTT CATTATGTTC AGAAGCTGCT ATGCAACAAA TCCGTGAAAA GATGGATCTT 1320
ATCGACTTGG AAGAAGAAAC CATTGATACT GAAGTGTTGA ACTCTTTGGO TGTCACTCAA 1380
GACAACTTCA GATTTGCTCT CGGAAACTCC AACCCATCTG CCTTGCGTGA AACTGTTGTT 1440
GAAAATGTTA ATGTCACTTG GGATGATATT GGTGGTTTGG ACAACATTAA GAATGAATTA 1500
AAAGAAACCG TGGAGTATCC TGTTTTACAT CCAGATCAAT ACCAAAAATT CGGATTGGCA 1560
CCAACAAAAG GTGTTTTGTT CTTTGCTCCA CCAGGTACTG GTAAGACACT TTTGGCCAAG 1620
GCTGTTGCTA CTGAACTTTC TGCTAATTTC ATTTCTGTCA AAGGTCCAGA ATTGTTGAGT 1680
ATGTGGTATG GTGAATCTGA GTCTAATATC CGTGATATAT TTGACAAGGC CAGAGCTGCT 1740
GCTCCTACTG TGGTGTTTTT GGATGAATTG GACTCCATTG CCAAAGCTAG AGGTGGTTCT 1800
CACGGTGATG CTGGTGGTGC CTCCGACAGA GTGGTCAATC AATTGTTGAC TGAAATGGAC 1860
GGTATGAATG CTAAGAAGAA TGTCTTTGTC ATTGGTGCCA CTAACAGACC AGATCAAATT 1920
GATCCTGCAT TATTGAGACC AGGTAGATTG GATCAATTAA TTTATGTCCC ATTGCCAGAT 1980
GAGCCAGCTA GATTCTCTAT TTTACAAGCT CAATTGAC»A ACACTCCATT AGAACCTGGT 2040
TTGGACTTGA ACGAAATTGC CAAGATCACT CACGCTTTCT CGGGTGCAGA TTTGTCTTAT 2100
ATTGTTCAAA GATCTGCTAA ATTTGCTATT AAAGACTCTA TTGAACCCCA AGTAAAGATT 2160
AACAAGATTA AAGAAGAAAA AGAAAAGGTG AAAACTGAAG ATGTTGATAT GAAGGTAGAT 2220
GAAGTTGAAC AAGAAGACCC TGTGCCTTAC ATTACCAGAG CTCACTITGA AGAGGCTATG 2280
AAGACCGCAA AMwSATCTGT tTCAGRCGCT GAATTACGTC CTTATGAGTC TTACGCTCAA 2340
CAATTCCAAC CCTCAAGRGG TCAATTTTCT ACCTTTAGAT TCAATGAAAA TGCTGCTGCC 2400
ACTGATAAIC GTTCAGCAGC AGCTGCCAAC TCAGGTOCAG CTTTCGGAAA CGTTGAAGAG 2460
GAAGACGATT TGTACAGTTG A
(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 826 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

Met Thr Gly Glu Glu Asp Lys Lys Gln His Phe Asp Ala Ser Gly Ala
1 5 10 15

Ser Ala Val Asp Asp Lys Thr Ala Thr Ala Ile Leu Arg Arg Lys Lys
20 25 30 

Lys Asp Asn Ala Leu Val Val Asp Asp Ala Thr Asn Asp Asp Asn Ser 
35 40 45 

Val lie Thr Met Ser Ser Asn Thr Met Glu Leu Leu Gin Leu Phe Arg 
50 55 60 

Gly Asp Thr Val Leu Val Lys Gly Lys Lys Arg Lys Asp Thr Val Leu 
65 70 75 80

Ile Val Leu Ala Asp Asp Asp Met Pro Asp Gly Val Ala Arg Val Asn
85 90 95

Arg Cys Val Arg Asn Asn Leu Arg Val Arg Leu Gly Asp Ile Val Thr
100 105 110

Val His Pro Cys Pro Asp Ile Lys Tyr Ala Asn Arg Ile Ser Val Leu
115 120 125

Pro Ile Ala Asp Thr Val Glu Gly Ile Asn Gly Ser Leu Phe Asp Leu
130 135 140

Tyr Leu Lys Pro Tyr Phe Val Glu Ala Tyr Arg Pro Val Arg Lys Gly 
145 150 155 160 



Asp Leu Phe Thr Val Arg Gly Gly Met Arg Gin Val Glu Phe Lys Val 
165 170 175

Val Glu Val Asp Pro Glu Glu Ile Ala Ile Val Ala Gln Asp Thr Ile
180 185 190

Ile His Cys Glu Gly Glu Pro Ile Asn Arg Glu Asp Glu Glu Asn Ser
195 200 205

Leu Asn Glu Val Gly Tyr Asp Asp Ile Gly Gly Cys Lys Lys Gln Met
210 215 220

Ala Gln Ile Arg Glu Leu Val Glu Leu Pro Leu Arg His Pro Gln Leu
225 230 235 240

Phe Lys Ser Ile Gly Ile Lys Pro Pro Lys Gly Ile Leu Met Tyr Gly
245 250 255 

Pro Pro Gly Thr Gly Lys Thr He Met Ala Arg Ala Val Ala Asn Glu 

260 265 270 

Thr Gly Ala Phe Phe Phe Leu Ile Asn Gly Pro Glu Ile Met Ser Lys
275 280 285

Met Ala Gly Glu Ser Glu Ser Asn Leu Arg Lys Ala Phe Glu Glu Ala
290 295 300 

Glu Lys Asn Ser Pro Ser Ile Ile Phe Ile Asp Glu Ile Asp Ser Ile
305 310 315 320

Ala Pro Lys Arg Asp Lys Thr Asn Gly Glu Val Glu Arg Arg Val Val
325 330 335 

Ser Gin Leu Leu Thr Leu Met Asp Gly Met Lys Ala Arg Ser Asn Val 
340 345 350 

Val Val Ile Ala Ala Thr Asn Arg Pro Asn Ser Ile Asp Pro Ala Leu
355 360 365

Arg Arg Phe Gly Arg Phe Asp Arg Glu Val Asp Ile Gly Val Pro Asp
370 375 380 

Ala Glu Gly Arg Leu Glu He Leu Arg He His Thr Lys Asn Met Lys 
385 390 395 400 

Leu Ala Asp Asp Val Asp Leu Glu Ala He Ala Ser Glu Thr His Gly 
405 410 415 

Phe Val Gly Ala Asp He Ala Ser Leu Cys Ser Glu Ala Ala Met Gin 
420 425 430 

Gin He Arg Glu Lys Met Asp Leu He Asp Leu Glu Glu Glu Thr He 
435 440 445 

Asp Thr Glu Val Leu Asn Ser Leu Gly Val Thr Gin Asp Asn Phe Arg 
450 455 460 

Phe Ala Leu Gly Asn Ser Asn Pro Ser Ala Leu Arg Glu Thr Val Val 
465 470 475 480 

Glu Asn Val Asn Val Thr Trp Asp Asp He Gly Gly Leu Asp Asn He 
485 490 495 

Lys Asn Glu Leu Lys Glu Thr Val Glu Tyr Pro Val Leu His Pro Asp
500 505 510 

Gin Tyr Gin Lys Phe Gly Leu Ala Pro Thr Lys Gly Val Leu Phe Phe 
515 520 525 

Gly Pro Pro Gly Thr Gly Lys Thr Leu Leu Ala Lys Ala Val Ala Thr
530 535 540

Glu Val Ser Ala Asn Phe Ile Ser Val Lys Gly Pro Glu Leu Leu Ser
545 550 555 560

Met Trp Tyr Gly Glu Ser Glu Ser Asn Ile Arg Asp Ile Phe Asp Lys
565 570 575

Ala Arg Ala Ala Ala Pro Thr Val Val Phe Leu Asp Glu Leu Asp Ser
580 585 590

Ile Ala Lys Ala Arg Gly Gly Ser His Gly Asp Ala Gly Gly Ala Ser
595 600 605

Asp Arg Val Val Asn Gin Leu Leu Thr Glu Met Asp Gly Met Asn Ala 
610 615 620 

Lys Lys Asn Val Phe Val Ile Gly Ala Thr Asn Arg Pro Asp Gln Ile
625 630 635 640

Asp Pro Ala Leu Leu Arg Pro Gly Arg Leu Asp Gln Leu Ile Tyr Val
645 650 655

Pro Leu Pro Asp Glu Pro Ala Arg Leu Ser Ile Leu Gln Ala Gln Leu
660 665 670 

Arg Asn Thr Pro Leu Glu Pro Gly Leu Asp Leu Asn Glu He Ala Lys 
675 680 685 






He Thr His Gly Phe Ser Gly Ala Asp Leu Ser Tyr He Val Gin Arg 
690 695 700 

Ser Ala Lys Phe Ala He Lys Asp Ser He Glu Ala Gin Val Lys He 

705 710 715 720 

Asn Lys Ile Lys Glu Glu Lys Glu Lys Val Lys Thr Glu Asp Val Asp
725 730 735 

Met Lys Val Asp Glu Val Glu Glu Glu Asp Pro Val Pro Tyr He Thr 
740 745 750 

Arg Ala His Phe Glu Glu Ala Met Lys Thr Ala Lys Arg Ser Val Ser 
755 760 765 

Asp Ala Glu Leu Arg Arg Tyr Glu Ser Tyr Ala Gin Gin Leu Gin Ala 
770 775 780 

Ser Arg Gly Gin Phe Ser Ser Phe Arg Phe Asn Glu Asn Ala Gly Ala 
785 790 795 800 

Thr Asp Asn Gly Ser Ala Ala Gly Ala Asn Ser Gly Ala Ala Phe Gly 
805 810 815

Asn Val Glu Glu Glu Asp Asp Leu Tyr Ser 
820 825 

(2) INFORMATION FOR SEQ ID NO: 47:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1918 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
TTTTTTTTTC TCCCTCTCTC TCGTTCAGAT TCTGTAGAAT TGATTGGTTG AGAGTAAAAG 60 
TCAGACTTTT TTTTTTGCTC TCCATCTAGT GGGACAAATA AGAAGTTTAA CAAAGAACGA 120 
CAAAAAATCC TCACCAGAAG AAAAAAAAAT CAATTTTCAC AGGTAAAGTT GTACGGACAG 180
CACGACAGAC ACAAAACTAA ACTAAATCCA TGAGGAAAAA AGTAAAAAAA AAAAAATTGT 240
•prACCACAAC TTCAAGAGCC ATTAAAACCA AAAATTTGGA ATATAAATTT CAACTGATTT 300
CTTGCTGGAT TTTTTTGTAT ATATTTGCAA TTGATTTCCT TTTACTTTTT TTTTTTCCAT 360
CCTITTTCCA TCTTTTAAGT TTCTTTTAGA ATATAGTATA TTTATCAAAC 420
AATGTCTGCA TTCAGATCAA TTCAACCTTC AACCAACGTA GCCAAGAGCA CTTTCAAAAA 480
CAGCATCAGA ACATATGCTT CTGCTGAACC AGTATGTATT CACTTTTTTG AGGATCCGGG 540
CAATGTGCTT GGGATTTTAC TTTTAACGTA TATACAAAGA TAATTTACTA ACTTGCTTTC 600
TTAGACCTTA AAACAAAGAT TGGAAGAAAT CTTGCCAfiCC AAAGCTGAA6 AAGTTAAACA 660
ATTCAAAAAA GAACACGGTA AAACTGTCAT TGGTGAAGTT TTATTAGAAC AAGCTTACGG 720
TGGTATGAGA GG7ATCAAAG GTTTAGTTTG GGAAGGTTCT GTTTTGGACC CAATTGAAGG 780
TATCCGTTTC AGAGGAAGAA CCATCCCAGA CATTCAAAAA GAATTGCCAA AAGCACCAGG 840
TGGTGAAGAA CCATTACCAG AAGCTCTTTT CTGGTTCTTG TTGACTCGTG AAGTTCCAAC 900
TGACGCCCAA ACTAAGGCTT TATCCGAAGA ATTTGCTGCT AGATCAGCAT TACCAAAGCA 960
CGTTGAAGAA TTGATCGACA GATCTCCATC TCACTTGCAC CCAATGGCTC AATTCTCCAT 1020
TGCCGTTACT GCTTTGGAAT CTGAATCCCA ATTTGCCCAA GCTTATGCTA AAGGTGCCAA 1080
CAAATCCGAA TACTGGAAAT ACACTTACGA AGATTCCATC GATTTGTTAG CTAAATTGCC 1140
AACCATTGCT GCTAAGA7TT ACAGAAACGT TTTCCACGAT GGTAAATTGC CAGCTGCCAT 1200
TGACTCCAAA TTGGATTACG GTGCTAACTT GGCCAGTTTG TTAGGTTTTG GTGACAACAA 1260
GGAATTTGTT GAATTAATGA GATTGTACCT TACCATCCAC TCTGACCACG AAGGTGGTAA 1320
CGTCTCTGCA CACACCACCC ACTTGGTTGG TTCCGCTTTA TCTTCCCCAT TCTTGTCATT 1380
AGCTGCTGGT TTGAATGGTT TAGCTGGTCC ATTACACGGT AGAGCTAACC AAGAAGTTTT 1440
GGAATGGTTG TTCAAATTAA GAGAAGAATT AAACGGTGAC TACTCCAAGG AAGCCATTGA 1500
AAAATACTTG TGGGAAACCT TGAACTCCGG TAGAGTTGTC CCAGGTTACG GTCACGCTGT 1560
GATACACTGC TCAAAGAGAA TTTCCTCTTA AACATATGCC 1620
AGACTACGAA TTGTTCAAAT TGGTTTCAAA CATTTACGAA GTCGCTCCAG GTGTTTTAAC 1680
CAAACACGGT AAGACCAAGA ACCCATGGCC AAATGTGGAC TCCCACTCTG GTGTCTTGTT 1740
ACAATACTAC GGTTTGACTG AACAATCTTT CTACACTGTC TTGTTCGGTG TTTCCAGAGC 1800
CTTTGGTGTC TTGCCACAAT TGATCTTGGA CCGTGGTATC GGTATGCCAA TTGAAAGACC 1860
AAAATCTTTC TCCACTGAAA AATACATTGA ATTGGTCAAA AACATCAACA AAGCTTAA 1918



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 466 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Met Ser Ala Phe Arg Ser Ile Gln Arg Ser Thr Asn Val Ala Lys Ser
1 5 10 15

Thr Phe Lys Asn Ser Ile Arg Thr Tyr Ala Ser Ala Glu Pro Thr Leu
20 25 30 

Lys Gln Arg Leu Glu Glu Ile Leu Pro Ala Lys Ala Glu Glu Val Lys
35 40 45

Gln Phe Lys Lys Glu His Gly Lys Thr Val Ile Gly Glu Val Leu Leu
50 55 60

Glu Gln Ala Tyr Gly Gly Met Arg Gly Ile Lys Gly Leu Val Trp Glu
65 70 75 80

Gly Ser Val Leu Asp Pro Ile Glu Gly Ile Arg Phe Arg Gly Arg Thr
85 90 95

Ile Pro Asp Ile Gln Lys Glu Leu Pro Lys Ala Pro Gly Gly Glu Glu
100 105 110

Pro Leu Pro Glu Ala Leu Phe Trp Leu Leu Leu Thr Gly Glu Val Pro
115 120 125

Thr Asp Ala Gln Thr Lys Ala Leu Ser Glu Glu Phe Ala Ala Arg Ser
130 135 140

Ala Leu Pro Lys His Val Glu Glu Leu Ile Asp Arg Ser Pro Ser His
145 150 155 160 

Leu His Pro Met Ala Gln Phe Ser Ile Ala Val Thr Ala Leu Glu Ser
165 170 175

Glu Ser Gln Phe Ala Gln Ala Tyr Ala Lys Gly Ala Asn Lys Ser Glu
180 185 190 

Tyr Trp Lys Tyr Thr Tyr Glu Asp Ser He Asp Leu Leu Ala Lys Leu 
195 200 205 

Pro Thr He Ala Ala Lys He Tyr Arg Asn Val Phe His Asp Gly Lys 
210 215 220 

Leu Pro Ala Ala Ile Asp Ser Lys Leu Asp Tyr Gly Ala Asn Leu Ala
225 230 235 240

Ser Leu Leu Gly Phe Gly Asp Asn Lys Glu Phe Val Glu Leu Met Arg
245 250 255 

Leu Tyr Leu Thr He His Ser Asp His Glu Gly Gly Asn Val Ser Ala 
260 265 270 

His Thr Thr His Leu Val Gly Ser Ala Leu Ser Ser Pro Phe Leu Ser 
275 280 285 

Leu Ala Ala Gly Leu Asn Gly Leu Ala Gly Pro Leu His Gly Arg Ala 
290 295 300 

Asn Gin Glu Val Leu Glu Trp Leu Phe Lys Leu Arg Glu Glu Leu Asn 
305 310 315 320 

Gly Asp Tyr Ser Lys Glu Ala He Glu Lys Tyr Leu Trp Glu Thr Leu 
325 330 335 

Asn Ser Gly Arg Val Val Pro Gly Tyr Gly His Ala Val Leu Arg Lys 
340 345 350 

Thr Asp Pro Arg Tyr Thr Ala Gin Arg Glu Phe Ala Leu Lys His Met 
355 360 365 

Pro Asp Tyr Glu Leu Phe Lys Leu Val Ser Asn Ile Tyr Glu Val Ala
370 375 380

Pro Gly Val Leu Thr Lys His Gly Lys Thr Lys Asn Pro Trp Pro Asn
385 390 395 400

Val Asp Ser His Ser Gly Val Leu Leu Gln Tyr Tyr Gly Leu Thr Glu
405 410 415

Gln Ser Phe Tyr Thr Val Leu Phe Gly Val Ser Arg Ala Phe Gly Val
420 425 430

Leu Pro Gln Leu Ile Leu Asp Arg Gly Ile Gly Met Pro Ile Glu Arg
435 440 445

Pro Lys Ser Phe Ser Thr Glu Lys Tyr Ile Glu Leu Val Lys Asn Ile
450 455 460

Asn Lys 
465 

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 678 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:



TTTTCTGATT ATCATGTTAT TTGGTTAGCT AAACGGAATA ATGCGATAAT GGAAGCTGAA 60
TATCGATTAT ATTTATTACT TATCACTTTA ATCATTTCAC CCGTAGGGTT AATTATGTTT 120
GGTGTTGGTG CCGCTAGAGA ATGGCCATGG CAA6TGATTT ATGTTGGATT AGGTTTCATT 180
GGGTTTGGTT GGGGATCAAT TGGTGATACT TCAATGTCTT ATTTAATGGA TGCTTATCCT 240
GATATTGTCA TTCAAGGAAT GGTGGGAGTA AGTATTATTA ATAATACTTT GGCTTGTATT 300
TTCACTTTTG CTTGTTCTTA TTGGTTAGAT GGATCAGGAA CACAAAACAC ATATATTGCC 360
TTGTCAATTA TTGATTTTGC TACCATAGCA TTGGTTTTCC CCTTTTTATA TTATGGTAAA 420
ACATTTAGAA GGAAAACTAA AAGACTTTAT GTTTCAATGG TTGAATTGAC TCAAGGGATG 480
CGATAAGAGA GTGAGTGCTA AAAGAATTTT ATTAATGATA CATTTATTAT TAGAATTACT 540
ACTATGGAAA TCCGAGTCTC TGTTTTTTTT AGAAGTATAT TTTAGACGTA TTTAGAGTTG 600
TTTTTCTCCT TTGTACTTTA TTTAGCATTT TATAATATAT TAATTCAAGT TGCATTAATA 660
TATATAAATA AAAAAACT 678



(2) INFORMATION FOR SEQ ID NO: 50: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 159 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



ATCGAATATA CSAGAGCTTTC TTCAGTTTAA TGAATTTATT TATTACTGCT TTTACTCAAT 240

GGAGGAATAT AATTATGAAT TGGTTGATGA TTTGATAAAA TTTATAACTA TAAATATGAA 300 

TTCTCATGGC AGAATAGTTA ATTTTGGCAC TAATGTTAAA ATTAATAAAT TACACGAATT 360 

AATTAAGAAT TTGATTGATA AAGTTAATAA AAACAAACAA AATCTGACTA GCAACAACAA 420 

AAACAACAAC AftCAACAACA GCAACAACAA CAGCAACAGC AACAATTCCC AACATATTGT 480 

TTTGATACCT AATGCCAACT CTTCCAATTT CCCATGGGAA TCGATCGAAT TTCTTCGTAG 540 

TAAATCAATT TCAAGAATGC CATCAATTCA TATCTTACTT GATCTAGTCA AATCAAACAC 600

CAATAACAAC AACAACTTAA TGTTTGTTGA TAAATCTAAT TTGTATTATT TGftTTAATCC 660 

CAGTCGTGAT TTAATTCGAT CAGAAAATCG ATTCAAAAAA CTATTTGAAT CAAATCATTT 720 

ATGGAGAGGG GAAATTGGAA AATTATCAAG TAATGAACAT GAAGATTATC AAGATTCAAT 780 

ATTATGTGAA ATCTTGAAAA GTCATTTATT TGTTTATATT GGTCATGGTG GTTGTGATCA 840 

ATATATTAAA GTATCAAAAT TATTTAAAAA ATGTCCCAAT AATCAAGATT TACTGAATAA 900 

ATTACCTCCT AGTTTATTGT TAGGTTGTTC ATCAGTTAAA TTAGATAATT GTAATTATAA 960 

CTATAATTCC AGTATCTTAC AACCACTGGG TAATATTTAT AATTGGTTGA ACTGTAAATC 1020

GTCAATGATA CTCGGGAATC TATGGGATGT TACTGATAAR GAYATTGATA TTTTTACACT 1080 

TTCATTACTA CAAAAATGGG GGTTAATAGA TGATTATAAT GGYAGTGGCC ATGATTATGG 1140 

TATGAAGAAA TTGGATTTGA CTAATTGTGT TGTTCAAAGT CGAAGTAAAT GTACTTTGAA 1200 

ATACTTGAAT GGATCAGCAC CTGTGGTTTA TGGTCTACCA ATGTATTTAA AATAGACATT 1260 

CTGTTTGCAT ATAAGTTTAT ATATTTTAAT AATAAGAAAA AGAGCATAAT TTGGATCTTG 1320 

ATTTTGTATT GTTTGGTTTG TTATGAACAA ATTTTGCACC CAATCACTAT CGAACTTTCT 1380 

TTTTTAAACA GAGAACATTT AATCAACATT TAIGTTACAT TTAAGCGTTT AAATACATAT 1440

TTGTCTTAGA TAGTTATATA ATGTTTGATG CAAACATACA ^^^^ 

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 417 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

;^,p Phe Gln Leu Gln Asp Ile Leu His His Val Glu Ser Lys Trp
1 5 10 15

Phe Gly Gly Phe He Ser Gly He Phe Thr Asn Asp Asn Asp Val Glu 
20 25 30 

Asn Glu Ser Lys Asn Val Phe His Lys Phe Lys Gin Asp Leu Met Lys 
35 40 45

Ile Leu Lys Asp Cys Leu Thr Val Ser Asp Asp Lys Ser Asn Ile Glu
50 55 60

Arg Phe Leu Gln Phe Asn Glu Phe Ile Tyr Tyr Cys Phe Tyr Ser Met
65 70 75 80

Glu Glu Tyr Asn Tyr Glu Leu Val Asp Asp Leu Ile Lys Phe Ile Thr
85 90 95

Ile Asn Met Asn Ser His Gly Arg Ile Val Asn Phe Gly Thr Asn Val
100 105 110

Lys Ile Asn Lys Leu His Glu Leu Ile Lys Asn Leu Ile Asp Lys Val
115 120 125

Asn Lys Asn Lys Gin Asn Val Thr Ser Asn Asn Lys Asn Asn Asn Asn 
130 135 140 

Asn Asn Ser Asn Asn Asn Ser Asn Ser Asn Asn Ser Gin His He Val 
145 150 155 160 

Leu Ile Pro Asn Ala Asn Cys Ser Asn Phe Pro Trp Glu Ser Met Glu
165 170 175

Phe Leu Arg Ser Lys Ser He Ser Arg Met Pro Ser He His Met Leu 
180 185 190 

Leu Asp Leu Val Lys Ser Asn Thr Asn Asn Lys Asn Lys Leu Met Phe
195 200 205 

Val Asp Lys Ser Asn Leu Tyr Tyr Leu He Asn Pro Ser Gly Asp Leu 
210 215 220 

Ile Arg Ser Glu Asn Arg Phe Lys Lys Leu Phe Glu Ser Asn His Leu
225 230 235 240 

Trp Arg Gly Glu He Gly Lys Leu Ser Ser Asn Glu His Glu Asp Tyr 
245 250 255 

Gin Asp Ser He Leu Cys Glu He Leu Lys Ser His Leu Phe Val Tyr 
260 265 270 

He Gly His Gly Gly Cys Asp Gin Tyr He Lys Val Ser Lys Leu Phe 
275 280 285 

Lys Lys Cys Gly Asn Asn Gin Asp Leu Ser Asn Lys Leu Pro Pro Ser 
290 295 300 

Leu Leu Leu Gly Cys Ser Ser Val Lys Leu Asp Asn Cys Asn Tyr Asn 
305 310 315 320 

Tyr Asn Ser Ser Met Leu Gln Pro Ser Gly Asn Ile Tyr Asn Trp Leu
325 330 335 

Asn Cys Lys Ser Ser Met He Leu Gly Asn Leu Trp Asp Val Thr Asp 
340 345 350 

Xaa Xaa He Asp He Phe Thr Leu Ser Leu Leu Gin Lys Trp Gly Leu 
355 360 365 

He Asp Asp Tyr Asn Xaa Ser Gly His Asp Tyr Gly Met Lys Lys Leu 
370 375 380 

Asp Leu Thr Asn Cys Val Val Gln Ser Arg Ser Lys Cys Thr Leu Lys
385 390 395 400

Tyr Leu Asn Gly Ser Ala Pro Val Val Tyr Gly Leu Pro Met Tyr Leu 
405 410 415 

Lys 

(2) INFORMATION FOR SEQ ID NO: 53:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1443 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:




CTTCTTTTAG AGACAATGCA GTGGTTTTCT TACCAGATGC ATGACCCCCA CCCAATAAAA 60
CTATAATCGA TCTATTCACA GTATTTGATG CCATTTTGAT GGTGATGAAT GATGTGATGT 120
GATGCTCATC TTATTGGGAG TTTCAAAAAA AAAAGTTACA CTCGAAAAAA AAAAAATAGC 180
ATTATAAATA GAAGCTTTAC TATCTTATAG AACAAAACAA AAAACACTAT CTTCTAATTA 240
ATAATGGATG ATTTTGATAG AGATTTAGAT AATGAGTTGG AATTTAGTCA TAAATCAACG 300
AAAGGAATAA AGGTTCATCG CACTTTTGAA ACTATGAATT TGAAACCTGA TCTTTTGAAA 360
GGAATATATG CCTATGGATT TGAAGCACCA TCTGCTATTC AATCTAGGGC TATTATGCAG 420
ATCATCAGTC GTAGAGACAC AATAGCACAG GCACAATCTG GAACTGGTAA AACTGCTACT 480
TTTTCTATTG GTATGCTTGA GGTTATAGAT ACTAAATCAA AAGAGTGTCA AGCACTTATC 540
TTGTCTCCTA CTAGAGAGTT GGCAATTCAA ATACAAAATG TGGTCATGCA TTTAGGAGAT 600
TATATGAACA TTCACACCCA TGCCTGTATT GCTGGGAAAA ATGTCGGTGA GGATGTTAAG 660
AAATTGCAGC AAGGGCAACA AATAGTTAGT GGGACACCAG GTAGA6TGAT TGATGTGATA 720
AAAAGAAGAA ATCTACAAAC TAGAAATATC AAGGTTCTTA TTTTAGATGA AGCTGATGAA 780
CTTTTTACAA AAGGGTTTAA AGAACAGATC TACGAAATCT ACAAACATTT ACCACCTTCG 840
GTTCAAGTAG TAGTTCTTAG TGCCACTTTG CCACGTGAAG TATTGGAGAT GACAAGTAAG 900
TTTACCACTG ATCCAGTGAA AATCTTGGTG AAGAGGGATG AGATTTCGCT TCTGGGAATC 960
AAACAATATT ATGTTCAATC TGAACGTGAA GATTCGAAGT TTGATACACT ATGTGATTTG 1020
TATGACAACC TTACAATAAC TCAAGCAGTG ATATTTTGTA ATACCAAATT GAAGGTGAAT 1080
TGGCTTGCTG ATCAAATGAA AAAGCAAAAC TTTACTGTTG TGGCAATGCA TGGTGATATG 1140
AAACAAGATG AACGAGATTC AATTATGAAC GATTTTAGAA GGGGGAATTC AAGAGTATTA 1200
ATATCTACAG ATGTTTGGGC AAGAGGTATT GATGTCCAAC AAGTCTCGTT GGTAATAAAT 1260
TATGATTTGC CCACCCATAA GGAAAACTAT ATTCATAGAA TTGGACGATC AGGTAGATTT 1320
GGTAGAAAGG GAACAGCTAT AAACTTGATA ACTAAAGATG ATGTGGTCAC TTTAAAAGAA 1380
TTGGAGAAAT ATTATTCAAC GAAAATTAAG GAAATGCCAA TGAATATTAA TGATATAATG 1440
TAA 1443


(2) INFORMATION FOR SEQ ID NO: 54:





(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 399 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Met Asp Asp Phe Asp Arg Asp Leu Asp Asn Glu Leu Glu Phe Ser His
1 5 10 15

Lys Ser Thr Lys Gly Ile Lys Val His Arg Thr Phe Glu Ser Met Asn
20 25 30

Leu Lys Pro Asp Leu Leu Lys Gly Ile Tyr Ala Tyr Gly Phe Glu Ala
35 40 45 

Pro Ser Ala He Gin Ser Arg Ala He Met Gin He He Ser Gly Arg 
50 55 60 

Asp Thr He Ala Gin Ala Gin Ser Gly Thr Gly Lys Thr Ala Thr Phe 
65 70 75 80 

Ser He Gly Met Leu Glu Val He Asp Thr Lys Ser Lys Glu Cys Gin 
85 90 95 

Ala Leu Ile Leu Ser Pro Thr Arg Glu Leu Ala Ile Gln Ile Gln Asn
100 105 110

Val Val Met His Leu Gly Asp Tyr Met Asn He His Thr His Ala Cys 
115 120 125 

He Gly Gly Lys Asn Val Gly Glu Asp Val Lys Lys Leu Gin Gin Gly 
130 135 140 

Gin Gin He Val Ser Gly Thr Pro Gly Arg Val He Asp Val He Lys 
145 150 155 160 

Arg Arg Asn Leu Gln Thr Arg Asn Ile Lys Val Leu Ile Leu Asp Glu
165 170 175

Ala Asp Glu Leu Phe Thr Lys Gly Phe Lys Glu Gin He Tyr Glu He 
180 185 190 

Tyr Lys His Leu Pro Pro Ser Val Gin Val Val Val Val Ser Ala Thr 
195 200 205 

Leu Pro Arg Glu Val Leu Glu Met Thr Ser Lys Phe Thr Thr Asp Pro 
210 215 220

Val Lys He Leu Val Lys Arg Asp Glu He Ser Leu Ser Gly He Lys 
225 230 235 240 

Gin Tyr Tyr Val Gin Cys Glu Arg Glu Asp Trp Lys Phe Asp Thr Leu 
245 250 255 

Cys Asp Leu Tyr Asp Asn Leu Thr He Thr Gin Ala Val He Phe Cys 
260 265 270 

Asn Thr Lys Leu Lys Val Asn Trp Leu Ala Asp Gin Met Lys Lys Gin 
275 280 285 

Asn Phe Thr Val Val Ala Met His Gly Asp Met Lys Gin Asp Glu Arg 
290 295 300 

Asp Ser He Met Asn Asp Phe Arg Arg Gly Asn Ser Arg Val Leu He 

305 310 315 320 

Ser Thr Asp Val Trp Ala Arg Gly Ile Asp Val Gln Gln Val Ser Leu
325 330 335

Val Ile Asn Tyr Asp Leu Pro Thr Asp Lys Glu Asn Tyr Ile His Arg
340 345 350

Ile Gly Arg Ser Gly Arg Phe Gly Arg Lys Gly Thr Ala Ile Asn Leu
355 360 365

Ile Thr Lys Asp Asp Val Val Thr Leu Lys Glu Leu Glu Lys Tyr Tyr
370 375 380

Ser Thr Lys Ile Lys Glu Met Pro Met Asn Ile Asn Asp Ile Met
385 390 395

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1020 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 



AACGTTGGCC TGGCCCAGTT AATTCCGTTT CCAAGCAAAT GAATGTCGAT ACCGACATCA 60
TCACGTTGAC CCGTTTTATT TTACAAGAAC AGCAAACTGT TGCTCCCACC GCCACCGGTG 120
AGTTGTCGTT GTTGTTGAAT GCGCTTCAAT TTGCATTCAA GTTTATTGCC CACAATATCA 180
GAAGAGCTGA GTTGGTCAAC CTTATTGGTG TTTCTGGCTC TGCCAACTCT ACCGGTGATG 240
TTCAGAAGAA ATTGGATGTG ATTGGTGATG AGATCTTTAT CAATGCCATG AGATCTTCCA 300
ACAACGTCAA GGTTTTGGTT TCTGAAGAGC AAGAAGACCT TATTGTGTTC CCAGGTGGTG 360
GCACATATGC TGTTTGTACT GATCCAATTG ATGGGTCGTC CAATATCGAT GCTGGTGTTT 420
CTGTTGGTAC GATTTTTGGT GTGTACAAGT TGCAAGAGGG GTCTACTGGT GGCATCAGCG 480
ATGTCTTGCG TCCTGCTAAC GAGATGGTCG CTGCGGGGTA CACCATGTAC GGTGCATCTG 540
CCCATTTGGC ATTGACTACA GGTCACGGTG TCAATCTTTT TACTTTCGAT ACTCAGTTGG 600
GTGAATTTAT CTTGACCCAT CCAAACTTGA AGTTGCCAGA TACTAAGAAC ATCTACTCGT 660
TGAATGAAGG GTACTCGAAC AAATTCCCAG AATACGTTCA AGATTATCTG AAGGACATTA 720
AAAAGGAAGG GTACAGTTTG AGATACATTG GACTGATGGT TGCTGATGTC CATCGTACTC 780
TTTTGTATGG TGGTATTTTT GCTTACCCTA CATTAAAGTT GAGAGTGTTG TATGAATGTT 840
TCCCCATGGC CTTGTTCATG GAACAACCAG GCGGTTCTGC TGTCACCATC AAGGGTGAGA 900
CGATCTTGGA TATCTTGCCA AAAGGTATAC ACGACAAGAG TTCTATTGTG TTGGGATCCA 960
AGGGTGAAGT TGAAAAGTAT TTAAAGCATG TACCAAAATA GATTATGTAG AAAATTTATG 1020



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 320 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Met Asn Val Asp Thr Asp Ile Ile Thr Leu Thr Arg Phe Ile Leu Gln
1 5 10 15

Glu Gln Gln Thr Val Ala Pro Thr Ala Thr Gly Glu Leu Ser Leu Leu
20 25 30

Leu Asn Ala Leu Gln Phe Ala Phe Lys Phe Ile Ala His Asn Ile Arg
35 40 45

Arg Ala Glu Leu Val Asn Leu Ile Gly Val Ser Gly Ser Ala Asn Ser
50 55 60

Thr Gly Asp Val Gln Lys Lys Leu Asp Val Ile Gly Asp Glu Ile Phe
65 70 75 80

Ile Asn Ala Met Arg Ser Ser Asn Asn Val Lys Val Leu Val Ser Glu
85 90 95

Glu Gln Glu Asp Leu Ile Val Phe Pro Gly Gly Gly Thr Tyr Ala Val
100 105 110

Cys Thr Asp Pro Ile Asp Gly Ser Ser Asn Ile Asp Ala Gly Val Ser
115 120 125

Val Gly Thr He Phe Gly Val Tyr Lys Leu Gin Glu Gly Ser Thr Gly 
130 135 140 

Gly Ile Ser Asp Val Leu Arg Pro Gly Lys Glu Met Val Ala Ala Gly
145 150 155 160 

Tyr Thr Met Tyr Gly Ala Ser Ala His Leu Ala Leu Thr Thr Gly His 
165 170 175 

Gly Val Asn Leu Phe Thr Leu Asp Thr Gin Leu Gly Glu Phe He Leu 

180 185 190 

Thr His Pro Asn Leu Lys Leu Pro Asp Thr Lys Asn Ile Tyr Ser Leu
195 200 205 

Asn Glu Gly Tyr Ser Asn Lys Phe Pro Glu Tyr Val Gin Asp Tyr Ser 
210 215 220 

Lys Asp He Lys Lys Glu Gly Tyr Ser Leu Arg Tyr He Gly Ser Met 

225 230 235 240 

Val Ala Asp Val His Arg Thr Leu Leu Tyr Gly Gly Ho Phe Ala Tyr 
245 250 255 

Pro Thr Leu Lys Leu Arg Val Leu Tyr Glu Cys Phe Pro Met Ala Leu 
260 265 270 

Leu Met Glu Cln Ala Gly Gly Ser Ala Val Thr He Lys Gly Glu Arg 
275 280 285 

He Leu Asp He Leu Pro Lys Gly He His Asp Lys Ser Ser He Val 
290 295 300 

Leu Gly Ser Lys Gly Glu Val Glu Lys Tyr Leu Lys His Val Pro Lys 
305 310 315 320 
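
The peptide above reads Ser at position 239 where the cDNA listed immediately before it has the codon CTG (in the block GACTGATGGT). That is consistent with the alternative yeast nuclear code of Candida albicans, in which CUG is decoded as serine rather than leucine. A minimal sketch of such a translation follows, assuming Python; the names `codon_table` and `translate` and the `cug_as_ser` flag are illustrative and do not come from the specification.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_table(cug_as_ser=True):
    """Build a codon -> one-letter amino acid map."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    table = dict(zip(codons, STANDARD))
    if cug_as_ser:
        # Assumption: apply the Candida reassignment of CUG (CTG in cDNA).
        table["CTG"] = "S"
    return table


def translate(dna, cug_as_ser=True):
    """Translate a cDNA string in frame 1; unreadable codons become 'X'."""
    table = codon_table(cug_as_ser)
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(table.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# Three consecutive 10-base blocks from the cDNA preceding SEQ ID NO: 56.
fragment = "AGATACATTGGACTGATGGTTGCTGATGTC"
print(translate(fragment))                    # RYIGSMVADV
print(translate(fragment, cug_as_ser=False))  # RYIGLMVADV
```

With the reassignment the fragment gives Arg Tyr Ile Gly Ser Met Val Ala Asp Val, matching residues 235 to 244 of the listing; with the standard code residue 239 would read Leu.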



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 825 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:





AACCCCACCT TCAAAGACAA AGAAGATTTC GTCAAfiCAAA CGAATGTCAG AGCAGAAAAG 60

AACCAAGAAC TAATCAAATT TGCCCGTGAC AACCTTAACC ATTTACCATT CACCGAAAAA 120

GACGGAGGTG CATGGGAAAA CTATGAACGA ATGATCAGTG GTATGCTCTA CAACTGTTTA 180

CAAAAAGAAT TGGAAACAAC ACGTATGTCT TGCAGAGACT ACATGTTGGA CTACGGCAGT 240

TTCAGAACTA GAGATTATAA AACAACCCAA GAATTTCTTC ATGCAAAATA CAAACATTTA 300

GAAAGTTTCA TTGGACATGT TGGCAAAAAT GCATTTATGG AATATCCAAT CTATTTTGAT 360

TATGGGTTTA ACACTTATTT GGGTGATAAT TTCTATTCCA ATTACAATTT GACAATTTTG 420

GATGTTTCCA TAGTCAGAAT TGGTAATAAT GTCAAGTGTG GTCCCAATGT ATCTATCCTT 480

ACCCCAACAC ACCCAGTGGA TCCCACTTTG CGCTATGATC AATTGGAAAA TGCCTTGCCT 540

CTGACGGTGG GTAACGGGGT CTGGTTGTGT GGAAGCTGTA CCATTCTTGG TGGGGTGACA 600

GTAGGTGATG GCAGCATTGT GGCTGCTGGT CCAGTTGTCA ACAAGGACGT TCCACCAAAC 660

ACTGTAGTTC CGGGAGTTCC TGCTAGGGTA GTTAAGCAGC TAGAACCTAG AGACCCTAAC 720

TTTGACACTA TGGCAGTTTT GAAACAATAT GGTATGGGTT ATATAGATTA CTAATTAGAT 780

TTGATGTAAT GTACACGACT ACACTATTTG CTGGTGTCTG TTTTT 825



(2) INFORMATION FOR SEQ ID NO: 58: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 206 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Met Ile Ser Gly Met Leu Tyr Asn Cys Leu Gln Lys Glu Leu Glu Thr
1 5 10 15

Thr Arg Met Ser Cys Arg Asp Tyr Met Leu Asp Tyr Gly Ser Phe Arg
20 25 30

Thr Arg Asp Tyr Lys Thr Thr Gln Glu Phe Leu Asp Ala Lys Tyr Lys
35 40 45

His Leu Glu Ser Phe Ile Gly His Val Gly Lys Asn Ala Phe Met Glu
50 55 60

Tyr Pro Ile Tyr Phe Asp Tyr Gly Phe Asn Thr Tyr Leu Gly Asp Asn
65 70 75 80

Phe Tyr Ser Asn Tyr Asn Leu Thr Ile Leu Asp Val Ser Ile Val Arg
85 90 95

Ile Gly Asn Asn Val Lys Cys Gly Pro Asn Val Ser Ile Leu Thr Pro
100 105 110

Thr His Pro Val Asp Pro Thr Leu Arg Tyr Asp Gln Leu Glu Asn Ala
115 120 125

Leu Pro Val Thr Val Gly Asn Gly Val Trp Leu Cys Gly Ser Cys Thr
130 135 140

Ile Leu Gly Gly Val Thr Val Gly Asp Gly Ser Ile Val Ala Ala Gly
145 150 155 160

Ala Val Val Asn Lys Asp Val Pro Pro Asn Thr Val Val Ala Gly Val
165 170 175

Pro Ala Arg Val Val Lys Gln Leu Glu Pro Arg Asp Pro Asn Phe Asp
180 185 190

Thr Met Ala Val Leu Lys Gln Tyr Gly Met Gly Tyr Ile Asp
195 200 205

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1380 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:





AATTACAATC TCCTTTCTTA CTACCATATC CCATTAGTCT TATTGTCATT GTAGATAXTG 60

ATAATGGTTA AAGGATTGGT TTTCATTTTT TGTGTAATGA ATGAGCCAAA ATAAAAAATC 120

AATTCGATGC GATGCAAT6A AGTTTAATAA AATTTTTTTT TTTCTTTATT TCTTTTAATC 180

AACCCATCAA TCATTAAATT GAATCAATAC CTACCATTAA CATACTTCTA TATACATATA 240

TATATATAAC AAAATATCAT GGGGAAGATA ACAACTAGTG ATACTAAAAC AAAACAACGT 300

CATAATCCAT TATTAAAAGA TATTTCATCC CAAGGTGGGA ATTTAAGAAC CGTTCCAAGA 360

TCATCATCAT CATCATCATC ACAAAAGAAG AAATCATCAA AGAAACAAAG ACATAACGAT 420

GAAGACGACG AAGAAAATGG TGGCGGTGAA GGATTTTTAG ATGCTTCTAG TTCAAGAAAG 480

ATTTTACAAT TGGCAAA^.GA ACAACAAGAT GAACTTGAAC AAGAAGATGA AATACAAAAT 540

AAACCTTCAT TTGCTCAATC ATTTAAAAAT CAACAAATAG ATAGTGAAGA AGAAGAAGAG 600

GAAGATGAGT ATTCAGATTT TGAAGAAGAA GAAGAAGTTG AAGAGATAGT ATATGATGAA 660

GAAGATGCAG AAGTTGATCC CAAAGATGCA GAATTATTTA ATAAATATTT CCAATCCAAC 720

GGTGAAGCTA ATAATAATGA TGATGATAAT TCATTTCAAC CAACAATAAA TTTAGCTGAT 780

AAAATCTTAG CCAAAATTCA AGAAAAAGAA TCCCAACAAC AACAACAACA ACAAAGCTCT 840

CCAGATAATA GTAATGAAGA TGCCGTATTG TTACCACCAA AAGTCATTTT AGCTTATGAA 900

AAAATTGGTC AAATTTTATC AACTTATACT CATGGGAAAT TACCTAAATT ATTTAAAATT 960

TTACCAAGTT TAAAAAATTG CCAAGATGTA TTATACCTGA CAAATCCAAA TACTTGGACT 1020

CCTCATGCCA CATATGAAGC AACXAAATTA TTTGTGTCGA ATTTATCAAG TAATGAAGCT 1080

ACAGTTTTCA TTGAAACTAT CTTGTTGCCA CGATTCCCTG ATTCTATTGA AAATTCCGAT 1140

GATCATTCAT TAAATTATCA TATTTATCGA GCATTAAAAA AATCATTATA TAAACCAGGA 1200

CCTTTTTTCA AACGCTTCTT CTTACCTTTA CTCGATCCTT ATTGTTCTGT ACGTGAACCC 1260

ACTATTCCTC CTTCACTGTT AACTAAACTT TCTCTCCCTG TTTTACATTC ATGTCATTAT 1320

TGTGCCGTAC TGATGAATAA AAAACGAGAA TCACCTGTAT TTGTCCTACG GCGAATATAA 1380

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Met Gly Lys Ile Thr Thr Ser Asp Thr Lys Thr Lys Gln Arg His Asn
1 5 10 15

Pro Leu Leu Lys Asp Ile Ser Ser Gln Gly Gly Asn Leu Arg Thr Val
20 25 30

Pro Arg Ser Ser Ser Ser Ser Ser Ser Gln Lys Lys Lys Ser Ser Lys
35 40 45

Lys Gln Arg His Asn Asp Glu Asp Asp Glu Glu Asn Gly Gly Gly Glu
50 55 60

Gly Phe Leu Asp Ala Ser Ser Ser Arg Lys Ile Leu Gln Leu Ala Lys
65 70 75 80

Glu Gln Gln Asp Glu Leu Glu Gln Glu Asp Glu Ile Gln Asn Lys Pro
85 90 95

Ser Phe Ala Gln Ser Phe Lys Asn Gln Gln Ile Asp Ser Glu Glu Glu
100 105 110

Glu Glu Glu Asp Glu Tyr Ser Asp Phe Glu Glu Glu Glu Glu Val Glu
115 120 125

Glu Ile Val Tyr Asp Glu Glu Asp Ala Glu Val Asp Pro Lys Asp Ala
130 135 140

Glu Leu Phe Asn Lys Tyr Phe Gln Ser Asn Gly Glu Ala Asn Asn Asn
145 150 155 160

Asp Asp Asp Asn Ser Phe Gln Pro Thr Ile Asn Leu Ala Asp Lys Ile
165 170 175

Leu Ala Lys Ile Gln Glu Lys Glu Ser Gln Gln Gln Gln Gln Gln Gln
180 185 190

Ser Ser Pro Asp Asn Ser Asn Glu Asp Ala Val Leu Leu Pro Pro Lys
195 200 205

Val Ile Leu Ala Tyr Glu Lys Ile Gly Gln Ile Leu Ser Thr Tyr Thr
210 215 220

His Gly Lys Leu Pro Lys Leu Phe Lys Ile Leu Pro Ser Leu Lys Asn
225 230 235 240

Trp Gln Asp Val Leu Tyr Val Thr Asn Pro Asn Ser Trp Thr Pro His
245 250 255

Ala Thr Tyr Glu Ala Thr Lys Leu Phe Val Ser Asn Leu Ser Ser Asn
260 265 270

Glu Ala Thr Val Phe Ile Glu Thr Ile Leu Leu Pro Arg Phe Arg Asp
275 280 285

Ser Ile Glu Asn Ser Asp Asp His Ser Leu Asn Tyr His Ile Tyr Arg
290 295 300

Ala Leu Lys Lys Ser Leu Tyr Lys Pro Gly Ala Phe Phe Lys Gly Phe
305 310 315 320

Leu Leu Pro Leu Val Asp Gly Tyr Cys Ser Val Arg Glu Ala Thr Ile
325 330 335

Ala Ala Ser Val Leu Thr Lys Val Ser Val Pro Val Leu His Ser Cys
340 345 350

His Tyr Cys Gly Val Ser Met Asn Lys Lys Arg Glu Ser Pro Val Phe
355 360 365

Val Leu Arg Arg Ile
370

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 823 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:



AACCAACAAT GAGTCAAGTC GCTCCAAAGT GGTACCAATC AGAAGACGTT CCAGCTCCAA 60

AACAAACCAG AAACACTGCT CGTCCACAAA AATTACCTGC CTCTTTACTC CCAGGTACCG 120

TTTTAATTTT ATTOGCCGGT AGATTCAGAG GTAAAAGAGT TGTTTACTTG AAGAACTTGC 180

AAGACAACAC CTTATTGGTT TCTGGTCCAT TCAAAGTCAA TGCTGTTCCA TTGAGAAGAG 240

TTAACGCTAG ATACGTTATC GCCACCTCCA CCAAAGTCAA COTTTCTGGT GTTGATGTTT 300

CTAAATTCAA CGTCGAATAC TTTGCTAGAG AAAAATCTTC TAAATCTAAA AA^TCCGAAG 360

CTGAATTCTT CAATGAATCT CAACCATVAjGA AACAAATCAA AGCTGAAAGA GTTGCTGACC 420

AAAAATCTGT CGATGCTGCT TTATTAAGTG AAATCAAAAA GACCCCATTA TTGAAACAAT 480

ACTTGGCCGC TTCATTCTCT TTGAAGAACG GTGACAGACC ACACTTGTTA AAATTTTAAT 540

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

AACATTAAAG CAAGATGGAA AACGATAAAG GTCAATTAGT TGAATTATAC GTCCCAAGAA 60

AATGTTCTGC TACCAACAGA ATCATTAAAG CCAAAG\TCA CGCTTCTGTT CAAATCTCAA 120

TTGCTAAAGT TGATGAAGAC GGTAGAGCTA TTGCTGGTGA AAACATCACT TACGCTTTAA 180

GTGCTTACGT TAGAGGTAGA GGTGAAGCTG ATGACTCATT AAACAGATTG GCTCAACAAG 240

ACGGTTTATT GAAGAACCTC TGGTCTTACT CTCCTTAAGA GAATAQAAGA ATAGACAAAA 300

TTGATAATTG GGTATTTTAA GAAATTACTT TTTTTATATT GCAAATTAAT TTTAATCTTT 360

CTTCTGTGTA TATTTAATGT CTTAACATAA TAAAAAAAAA GAATAGAAAT GGTTT 415

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 87 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Met Glu Asn Asp Lys Gly Gln Leu Val Glu Leu Tyr Val Pro Arg Lys
1 5 10 15

Cys Ser Ala Thr Asn Arg Ile Ile Lys Ala Lys Asp His Ala Ser Val
20 25 30

Gln Ile Ser Ile Ala Lys Val Asp Glu Asp Gly Arg Ala Ile Ala Gly
35 40 45

Glu Asn Ile Thr Tyr Ala Leu Ser Gly Tyr Val Arg Gly Arg Gly Glu
50 55 60

Ala Asp Asp Ser Leu Asn Arg Leu Ala Gln Gln Asp Gly Leu Leu Lys
65 70 75 80

Asn Val Trp Ser Tyr Ser Arg
85

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1519 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 749

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:



ACCAT6TGTC AAATTGCTTG GTC6TGTCCT TTCACCACAC ATTTTTTTGG ATTAAATTTC 60

TCGCACGCTC AAAAAATGAC TTCGACAAAA AGCAATGCCA CTCTTCCTAC AATTAATTCC 120

CTCCGCCCCT TCCTTTTCAT ATACTATCTC CCTTCCTTCT TCCTTCTCCT TTTATTTTTT 180

CAATTATTAC AATCTTATGT CATTTAAAGG ATTCAAAAAG GGTGTCCTTA GGGCCCCACA 240

GACAATGCGT CAGAAATTCA ACATGGGAGA AATCACCCAA GATGCTGTTT ATCTCGATGC 300

TGAAAGAAGA TTCAAAGAAA TCGAAACGGA AACAAAAAAG TTGAGTGAAG AATCCAAGAA 360

ATATTTCAAT GCTGTCAATG GGATGTTAGA TGAACAAATT GATTTTGCCA AAGCCCTGGC 420

TGAGATTTAT AAACCAATCA GTGGTAGATT ATCGGACCCC AGTGCTACGG TACCAGAAGA 480

TAACCCACAA GGTATTGAAG CATCGGAACT GTACCAAGCA GTGGTTAAAG ATCTCAAAGA 540

TACCTTAAAA CCCGATTTGG AATTGATTGA AAAAAGAATT GTTGAACCAG CACAAGAATT 600

ATTGAAGATT ATACAAGCTA TAAGGAAAAT GTCAGTGAAA AGAGACCATA AACAATTGGA 660

TTTGGATCGT CATAAGAGAA ATTTTTCTAA ATATGAACTG AAGAAAGAAA GAACTGTTAA 720

AGATGAAGAA AAAATGTTCA GTGCTCAANC AGAAGTAGAA ATTGCTCAAC AAGAGTACGA 780

TTATTATAAT GATTTGTTAA AGAATGAATT CCCAGTTTTG TTTCAAATGC AAAGTGATTT 840

TATCAAACCA TTGTTTGTTT CATTCTATTA CATCCAGTTG AATATTTTCT ACACATTATA 900

CACTAGAATG GAAGAGTTGA AAATTCCATA TTTTGATTTG TCTACTGATA TTGTCGAAGC 960

TTATACTGCC AAGAAGGGGA ACATTGAGGA ACAAACCGAT GCTATTGGAA TCACTCATTT 1020

CAAAGTCGGG CATGCCAAAT CCAAATTGGA AGCCACTAAA AGAAGACATG CTGGTATGAA 1080

TAGTCCACCT CCTACCGGTG CCAGCTCTAT TGCATCTACA GGTACTGGTG GTGAATTACC 1140

TGCATACTCC CCAGGAGGTT ACAACCAACC ATATGGTGAT AGCAAGTATC AACCACCATC 1200

TTCTCCAGCA ACATACCAAT CTCCAGTAGT AGCAGCCACT GCTCAATCTC CAGCTACTTA 1260

TCAATCGCCA GTGGCTACTG GACAACCTCC ATCATATTTA CCACAAACTC CAGCCAGTGC 1320

TCCACCACCA CAAGTTGGTA GTGGCCTTCC AACATGCACG GCTTTATACG ATTATACTGC 1380

ACAAGCCCAG GGTGACTTGA CTTTCCCTGC AGGAGCTGTT ATTGAAATTA TACAAAGAAC 1440

CGAAGATGCC AACGGATGGT GGACTGGTAA ATACAATGGT CAAACCGGTG TGTTCCCTGG 1500

TAATTATGTG CAATTATAG 1519



(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 440 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Met Ser Phe Lys Gly Phe Lys Lys Gly Val Leu Arg Ala Pro Gln Thr
1 5 10 15

Met Arg Gln Lys Phe Asn Met Gly Glu Ile Thr Gln Asp Ala Val Tyr
20 25 30

Leu Asp Ala Glu Arg Arg Phe Lys Glu Ile Glu Thr Glu Thr Lys Lys
35 40 45

Leu Ser Glu Glu Ser Lys Lys Tyr Phe Asn Ala Val Asn Gly Met Leu
50 55 60

Asp Glu Gln Ile Asp Phe Ala Lys Ala Val Ala Glu Ile Tyr Lys Pro
65 70 75 80

Ile Ser Gly Arg Leu Ser Asp Pro Ser Ala Thr Val Pro Glu Asp Asn
85 90 95

Pro Gln Gly Ile Glu Ala Ser Glu Ser Tyr Gln Ala Val Val Lys Asp
100 105 110

Leu Lys Asp Thr Leu Lys Pro Asp Leu Glu Leu Ile Glu Lys Arg Ile
115 120 125

Val Glu Pro Ala Gln Glu Leu Leu Lys Ile Ile Gln Ala Ile Arg Lys
130 135 140

Met Ser Val Lys Arg Asp His Lys Gln Leu Asp Leu Asp Arg His Lys
145 150 155 160

Arg Asn Phe Ser Lys Tyr Glu Ser Lys Lys Glu Arg Thr Val Lys Asp
165 170 175

Glu Glu Lys Met Phe Ser Ala Gln Xaa Glu Val Glu Ile Ala Gln Gln
180 185 190

Glu Tyr Asp Tyr Tyr Asn Asp Leu Leu Lys Asn Glu Leu Pro Val Leu
195 200 205

Phe Gln Met Gln Ser Asp Phe Ile Lys Pro Leu Phe Val Ser Phe Tyr
210 215 220

Tyr Met Gln Leu Asn Ile Phe Tyr Thr Leu Tyr Thr Arg Met Glu Glu
225 230 235 240

Leu Lys Ile Pro Tyr Phe Asp Leu Ser Thr Asp Ile Val Glu Ala Tyr
245 250 255

Thr Ala Lys Lys Gly Asn Ile Glu Glu Gln Thr Asp Ala Ile Gly Ile
260 265 270

Thr His Phe Lys Val Gly His Ala Lys Ser Lys Leu Glu Ala Thr Lys
275 280 285

Arg Arg His Ala Ala Met Asn Ser Pro Pro Pro Thr Gly Ala Ser Ser
290 295 300

Ile Ala Ser Thr Gly Thr Gly Gly Glu Leu Pro Ala Tyr Ser Pro Gly
305 310 315 320

Gly Tyr Asn Gln Pro Tyr Gly Asp Ser Lys Tyr Gln Pro Pro Ser Ser
325 330 335

Pro Ala Thr Tyr Gln Ser Pro Val Val Ala Ala Thr Ala Gln Ser Pro
340 345 350

Ala Thr Tyr Gln Ser Pro Val Ala Thr Gly Gln Pro Pro Ser Tyr Leu
355 360 365

Pro Gln Thr Pro Ala Ser Ala Pro Pro Pro Gln Val Gly Ser Gly Leu
370 375 380

Pro Thr Cys Thr Ala Leu Tyr Asp Tyr Thr Ala Gln Ala Gln Gly Asp
385 390 395 400

Leu Thr Phe Pro Ala Gly Ala Val Ile Glu Ile Ile Gln Arg Thr Glu
405 410 415

Asp Ala Asn Gly Trp Trp Thr Gly Lys Tyr Asn Gly Gln Thr Gly Val
420 425 430

Phe Pro Gly Asn Tyr Val Gln Leu
435 440

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 855 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

ATAATTTTCA GAAAGAGACT AGATTCTGAT AGAAATATAG ACGCATCACT ATATTTTGGA 60

AATATAGATC CACAAGTTAC GCAGTTGTTA ATGTATGAGT TGTTCATCCA ATTTGGTCCC 120

CTCAAATCAA TCAATATCCC AAACGATCGT ATATTGAAAA CACACCAGGG GTATGGATTT 180

GTCGAATTTA AAAACTCACC AGATGCCAAA TATACTATGG AAATACTACG AGGAATAAGA 240

CTTTATGGAA AAGCATTGAA ATTGAAACGA ATTGATGCCA AGTCTCAGTC ATCAACAAAC 300

AACCCAAATA ATCAAACAAT AGGAACATTT GTACAATCAG ATTTGATCAA TCCAAATTAC 360

ATAGATGTTG GAGCTAAACT ATTTATCAAC AATCTTAATC CATTGGTCGA TGAATCCTTT 420

TTAATGGATA CGTTTAGTAA GTTTGGAACC CTTATAAGAA ACCCAATAAT TAGACGTGAT 480

TCAGAGGGAC ACTCTTTCGG ATACGGATTT CTTACGTACG ATGACTTTGA AAGTAGTGAT 540

TTATCCATAC AAAAAATGAA CAACACGATT TTGATGAATA ACAAAATTCC TATCACTTAT 600

GCATTCAAGG ATCTGAGTGT TGATCCGAAG AAATCCCGGC ATGGAGATCA AGTGGAGCGG 660

AAATTGGCTG AAAGTGCCAA AAAGAATAAT TTGTTGGTAA CGAAAACTTC TAAGGCAGGT 720

ACGACGAAGG GAAATAAAAG GAAGAATAAA CCACATAAAG TGACCAAACC GTGAGACAAT 780

GAGTTAGCTC CCCCTTTCAA AATAAGTAGA GTATCACCAT AGTTTATGAA ACAATTGATA 840

TATTAAGCTT CTCTG 855

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 257 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Ile Ile Phe Arg Lys Arg Leu Asp Ser Asp Arg Asn Ile Asp Ala Ser
1 5 10 15

Leu Tyr Phe Gly Asn Ile Asp Pro Gln Val Thr Glu Leu Leu Met Tyr
20 25 30

Glu Leu Phe Ile Gln Phe Gly Pro Val Lys Ser Ile Asn Met Pro Lys
35 40 45

Asp Arg Ile Leu Lys Thr His Gln Gly Tyr Gly Phe Val Glu Phe Lys
50 55 60

Asn Ser Ala Asp Ala Lys Tyr Thr Met Glu Ile Leu Arg Gly Ile Arg
65 70 75 80

Leu Tyr Gly Lys Ala Leu Lys Leu Lys Arg Ile Asp Ala Lys Ser Gln
85 90 95

Ser Ser Thr Asn Asn Pro Asn Asn Gln Thr Ile Gly Thr Phe Val Gln
100 105 110

Ser Asp Leu Ile Asn Pro Asn Tyr Ile Asp Val Gly Ala Lys Leu Phe
115 120 125

Ile Asn Asn Leu Asn Pro Leu Val Asp Glu Ser Phe Leu Met Asp Thr
130 135 140

Phe Ser Lys Phe Gly Thr Leu Ile Arg Asn Pro Ile Ile Arg Arg Asp
145 150 155 160

Ser Glu Gly His Ser Leu Gly Tyr Gly Phe Leu Thr Tyr Asp Asp Phe
165 170 175

Glu Ser Ser Asp Leu Cys Ile Gln Lys Met Asn Asn Thr Ile Leu Met
180 185 190

Asn Asn Lys Ile Ala Ile Ser Tyr Ala Phe Lys Asp Ser Ser Val Asp
195 200 205

Gly Lys Lys Ser Arg His Gly Asp Gln Val Glu Arg Lys Leu Ala Glu
210 215 220

Ser Ala Lys Lys Asn Asn Leu Leu Val Thr Lys Thr Ser Lys Ala Gly
225 230 235 240

Thr Thr Lys Gly Asn Lys Arg Lys Asn Lys Pro His Lys Val Thr Lys
245 250 255

Pro

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1685 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CTGTTTATTA AATGGATATA TGTTAAACCA TGAACTTCGG TTTATCAGAA AAATTGGTGC 
TGGTACCTAT GCTTTGATTT ACCTTGTGGA AAATATCTAC ACIAAACAAC AATTTGCTGC 
TAAAATGGTT CTTGAACACC CATTACTCAA ACAAAAfiCAA CAACAACAAC AAAGTCATCA 
TGGACATAAA GGAGAATCTA GTATGAACAA ACAAATAAXA CTGCAAGAAT TTTATCAATA 
TTTTTTAAAC AATAGTATGC CACAACCACG AAATTTCGAC TTGAATTACC TTCGAfiACAA 
CGGACATGAT TGCCCCTTTT TGACTGAAAT CTCATTACAT TTAAAAOTAC ATCAACACCC 
AAACATAGCG ACTATTCATC AAGTATTAAA CATTGAAGAT TTTGCCATAA TAATATTGAT 
CGATCATTTT GAGCAAGGAG ATTTGTTCAC TAATATCATT GATAGACAAA TATTCACCAA 
TAATAGTCAT AGAAAAGTTC CAAGAACAGA TTTTGAAACC CAATTATTAA TGAAGAATGC 
CATGTTACAA TTGATAGAAG CCATTGAATA TTGTCACGAA AATAATATTT ACCATTGTGA 
TTTAAAACCA GAAAACATTA TGGTTAGATA TAATCCATAC TATGTTCCTC CAACTATCAA 
TAACAATAAT AACAATCGAG AAGATGATTT ATGCTATGCC AACAGTATTA TTGACTATAA 
TGAATTACyVC CTC6TCTTGA TTGATTTTGG TTTAGCTATG GACTCTGCTA CCATTTGTTG 
TAATTCATGT CGTGGATCGT CATTTTACAT GGCACCAGAA AGAACCACCA ATTATAACAC 
CCATCGITTA ATCAACCAAT TAATTGATAT GAATCAATAT GAGTCAATTG AAATCAATGG 
GACAACAGTG ACAAAATCAA ACTGTAAATA TTTACCTACA TTGGCTGGGG ATATTTGGTC 
ATTGGGAGTA TTGTTCATTA ATATCACTTG TTCAAGAAAC CCATGGCCCA TTGCATCATT 
TGATAATAAT CAAAATAATC AAGTGTTTAA GAATTATATG TTGAATAATA ACAAGGCTGT 
TTTGAGCAAA ATCTTACCCA TTTCCTCACA ATTTAATCGC TTATTAGATA GAATTTTCAA 
ATTGAATCCT AATGATAGAA TAGATTTACC AACTTTATAC AAAGAAGTTA TTCGTTGTGA 
TTTCTTCAAA GATGATCATT ACTACTATGC CCAACATCAA CATCATCACA ATCACAATCA 
AATCAATAAT GCTTACAATC ACTATCAGAA ACAACCTAAT CAAGCAAGAC CTACTGCAAA 
CCAACAATTG TATACACCAC CGGAAACCAC CACTTATAAT TCATACGCTA GTGATATGGA 
AGAAGATGAA ATTAGTGATG ATGAGTTTTA TTCTGATGAA GAAGATGAAG ATATTGAAGA 
CTATGAAGAG GAAGAGGAAG AGTATTTTCG TAATGAGCAA CAACAACAAC AGCAAGTCAC 
AACAGTGAAT GGTAATTTTG CTCAACTTAA AGGTACCTGT TATTACGATA CCAAAACCAA 
AACAACTACA TATATAAAAC CACCAGCTGC ATATACTTTA GAGACGCCTA GTCAAAGTGT 
TGAATACTGT TAACTTGTAC ACATAAATAA TTAATGACAA TTAATAATAA CGATTAATAA 
TATAG 

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 537 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Met Leu Asn His Glu Leu Arg Phe Ile Arg Lys Ile Gly Ala Gly Thr
1 5 10 15

Tyr Gly Leu Ile Tyr Leu Val Glu Asn Ile Tyr Thr Lys Gln Gln Phe
20 25 30

Ala Ala Lys Met Val Leu Glu Gln Pro Leu Leu Lys Gln Lys Gln Gln
35 40 45

Gln Gln Gln Ser His His Gly His Lys Gly Glu Ser Ser Met Asn Lys
50 55 60

Gln Ile Ile Ser Gln Glu Phe Tyr Gln Tyr Phe Leu Asn Asn Ser Met
65 70 75 80

Pro Gln Pro Arg Asn Leu Asp Leu Asn Tyr Leu Arg Asp Asn Gly His
85 90 95

Asp Cys Pro Phe Leu Thr Glu Ile Ser Leu His Leu Lys Val His Gln
100 105 110

His Pro Asn Ile Ala Thr Ile His Gln Val Leu Asn Ile Glu Asp Phe
115 120 125

Ala Ile Ile Ile Leu Met Asp His Phe Glu Gln Gly Asp Leu Phe Thr
130 135 140

Asn Ile Ile Asp Arg Gln Ile Phe Thr Asn Asn Ser His Arg Lys Val
145 150 155 160

Pro Arg Thr Asp Phe Glu Thr Gln Leu Leu Met Lys Asn Ala Met Leu
165 170 175

Gln Leu Ile Glu Ala Ile Glu Tyr Cys His Glu Asn Asn Ile Tyr His
180 185 190

Cys Asp Leu Lys Pro Glu Asn Ile Met Val Arg Tyr Asn Pro Tyr Tyr
195 200 205

Val Arg Pro Thr Ile Asn Asn Asn Asn Asn Asn Gly Glu Asp Asp Leu
210 215 220

Cys Tyr Ala Asn Ser Ile Ile Asp Tyr Asn Glu Leu His Leu Val Leu
225 230 235 240

Ile Asp Phe Gly Leu Ala Met Asp Ser Ala Thr Ile Cys Cys Asn Ser
245 250 255

Cys Arg Gly Ser Ser Phe Tyr Met Ala Pro Glu Arg Thr Thr Asn Tyr
260 265 270

Asn Thr His Arg Leu Ile Asn Gln Leu Ile Asp Met Asn Gln Tyr Glu
275 280 285

Ser Ile Glu Ile Asn Gly Thr Thr Val Thr Lys Ser Asn Cys Lys Tyr
290 295 300

Leu Pro Thr Leu Ala Gly Asp Ile Trp Ser Leu Gly Val Leu Phe Ile
305 310 315 320

Asn Ile Thr Cys Ser Arg Asn Pro Trp Pro Ile Ala Ser Phe Asp Asn
325 330 335

Asn Gln Asn Asn Glu Val Phe Lys Asn Tyr Met Leu Asn Asn Asn Lys
340 345 350

Ala Val Leu Ser Lys Ile Leu Pro Ile Ser Ser Gln Phe Asn Arg Leu
355 360 365

Leu Asp Arg Ile Phe Lys Leu Asn Pro Asn Asp Arg Ile Asp Leu Pro
370 375 380

Thr Leu Tyr Lys Glu Val Ile Arg Cys Asp Phe Phe Lys Asp Asp His
385 390 395 400

Tyr Tyr Tyr Ala Gln His Gln His His His Asn His Asn Gln Ile Asn
405 410 415

Asn Ala Tyr Asn His Tyr Gln Lys Gln Pro Asn Gln Ala Arg Pro Thr
420 425 430

Ala Asn Gln Gln Leu Tyr Thr Pro Pro Glu Thr Thr Thr Tyr Asn Ser
435 440 445

Tyr Ala Ser Asp Met Glu Glu Asp Glu Ile Ser Asp Asp Glu Phe Tyr
450 455 460

Ser Asp Glu Glu Asp Glu Asp Ile Glu Asp Tyr Glu Glu Glu Glu Glu
465 470 475 480

Glu Tyr Phe Gly Asn Glu Gln Gln Gln Gln Gln Gln Val Thr Thr Val
485 490 495

Asn Gly Asn Phe Gly Gln Val Lys Gly Thr Cys Tyr Tyr Asp Thr Lys
500 505 510

Thr Lys Thr Thr Thr Tyr Ile Lys Pro Pro Ala Ala Tyr Thr Leu Glu
515 520 525

Thr Pro Ser Gln Ser Val Glu Tyr Cys
530 535

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 848 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

AACCAATTTT AGAAACAATG GCTCGTCAAT TTTTCGTACG TGGTAACTTC AAAGCTAACG 60

GTACCAAACA ACAAATCACT TCAATCATCG ACAACTTGAA CAAGGCTGAT TTACCAAAGG 120

ATGTCGAACT TGTCATTTGT CCACCCGCCC TTTACCTTGG TTTAGCTGTA GAGCAAAACA 180

AACAACCAAC TGTTGCCATT GGTGCTCAAA ATGTTTTTGA CAAGTCATGT GGTGCTTTCA 240

CTGGTGAAAC CTGTGCTTCT CAAATCTTGG ATGTTGGTGC CAGCTGGACT TTAACTGGTC 300

ACACTGAAAG AAGAACCATT ATCAAAGAAT CCGATGAATT CATTGCTGAA AAAACCAACT 360

TTGCCTTGGA CACTGGTGTC AAAGTTATTT TATGTATTGG TGAAACCTTA GAGGAAAGAA 420

AAGGTGGTGT CACTTTGGAT GTTTGTGCCA GACAATTGGA TGCTGTTTCC AAGATTGTTT 480

CTGATTGGTC AAACATTGTT GTTGCTTACG AACCTGTTTG GCCAATTGGT ACTGGTTTAG 540

CCGCTACCCC AGAAGATGCT GAAGAAACCC ACAAAGGTAT TAGAGCTCAT TTGGCCAAGA 600

CCATTCCTGC CGAACAAGCT GAAAAAACCA GAATCTTGTA CGGTGCTTCA GTTAACGGTA 660

AGAACGCTAA GGATTTCAAA GACAAAGCAA ATGTTGATGG TTTCTTACTC GGTGGTGCTT 720

CATTAAAACC AGAATTTGTT GATATCATCA AATCTAGATT ATAAACAGTA TATTAAAAAC 780

TATATGCCTA TAGAATTTAG CATGTTCTTG IGAATTTGTA ATGAATCTAT AAAAATGTGC 840

TCATGAAC 848

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 248 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Met Ala Arg Gln Phe Phe Val Gly Gly Asn Phe Lys Ala Asn Gly Thr
1 5 10 15

Lys Gln Gln Ile Thr Ser Ile Ile Asp Asn Leu Asn Lys Ala Asp Leu
20 25 30

Pro Lys Asp Val Glu Val Val Ile Cys Pro Pro Ala Leu Tyr Leu Gly
35 40 45

Leu Ala Val Glu Gln Asn Lys Gln Pro Thr Val Ala Ile Gly Ala Gln
50 55 60

Asn Val Phe Asp Lys Ser Cys Gly Ala Phe Thr Gly Glu Thr Cys Ala
65 70 75 80

Ser Gln Ile Leu Asp Val Gly Ala Ser Trp Thr Leu Thr Gly His Ser
85 90 95

Glu Arg Arg Thr Ile Ile Lys Glu Ser Asp Glu Phe Ile Ala Glu Lys
100 105 110

Thr Lys Phe Ala Leu Asp Thr Gly Val Lys Val Ile Leu Cys Ile Gly
115 120 125

Glu Thr Leu Glu Glu Arg Lys Gly Gly Val Thr Leu Asp Val Cys Ala
130 135 140

Arg Gln Leu Asp Ala Val Ser Lys Ile Val Ser Asp Trp Ser Asn Ile
145 150 155 160

Val Val Ala Tyr Glu Pro Val Trp Ala Ile Gly Thr Gly Leu Ala Ala
165 170 175

Thr Pro Glu Asp Ala Glu Glu Thr His Lys Gly Ile Arg Ala His Leu
180 185 190

Ala Lys Thr Ile Gly Ala Glu Gln Ala Glu Lys Thr Arg Ile Leu Tyr
195 200 205

Gly Gly Ser Val Asn Gly Lys Asn Ala Lys Asp Phe Lys Asp Lys Ala
210 215 220

Asn Val Asp Gly Phe Leu Val Gly Gly Ala Ser Leu Lys Pro Glu Phe
225 230 235 240

Val Asp Ile Ile Lys Ser Arg Leu
245 
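
Each entry in the listing declares a LENGTH and then gives the sequence in rows that carry running position numbers. One quick consistency check is to discard the numbers and count what remains. The sketch below, assuming Python, does this for both row styles; the helper names `count_bases` and `count_residues` are hypothetical, and the residue counter only recognises purely alphabetic tokens, so OCR-damaged codes containing digits or symbols would be missed.

```python
import re


def count_bases(rows):
    """Count nucleotide letters in rows such as 'AACC... GGTT... 60'."""
    return sum(len(re.sub(r"[^A-Za-z]", "", row)) for row in rows)


def count_residues(rows):
    """Count three-letter residue codes, skipping position-number tokens."""
    return sum(1 for row in rows for tok in row.split() if tok.isalpha())


# Last row of SEQ ID NO: 59 (positions 1321 to 1380).
seq59_tail = [
    "TGTGCCGTAC TGATGAATAA AAAACGAGAA TCACCTGTAT TTGTCCTACG GCGAATATAA 1380",
]
# Last row of SEQ ID NO: 64 with its position line (residues 81 to 87).
seq64_tail = ["Asn Val Trp Ser Tyr Ser Arg", "85"]

print(count_bases(seq59_tail))     # 60
print(count_residues(seq64_tail))  # 7
```

Summing the counts over every row of an entry and comparing the total with the declared LENGTH flags rows that were dropped or duplicated during extraction.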



Claims 

1. A nucleic acid molecule encoding a polypeptide which is critical for survival and growth of the yeast Candida
albicans and which nucleic acid molecule comprises any of the sequences of nucleotides in Sequence ID Numbers 1 to
3, 5, 6, 8 to 11, 13, 15, 16, 18, 20, 21, 23, 25 to 29, 31, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59,
61, 63, 65, 67, 69 and 71.

2. A nucleic acid molecule encoding a polypeptide which is critical for survival and growth of the yeast Candida
albicans and which nucleic acid molecule comprises any of the sequences of nucleotides in Sequence ID Numbers 28,
35, 37 and 39 and fragments or derivatives of said nucleic acid molecules.

3. A nucleic acid molecule encoding a polypeptide which is critical for survival and growth of the yeast Candida
albicans and which polypeptide has an amino acid sequence according to the sequences of any of Sequence ID
Numbers 4, 7, 12, 14, 17, 19, 22, 24, 30, 32 to 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64,
66, 68, 70 and 72.

4. A nucleic acid molecule according to any of claims 1 to 3 which is mRNA. 

5. A nucleic acid molecule according to any of claims 1 to 3 which is DNA.

6. A nucleic acid molecule according to claim 5 which is cDNA. 

7. A nucleic acid molecule capable of hybridising to the molecules according to any of claims 1 to 5 under high 
stringency conditions. 

8. A polypeptide having the amino acid sequences of any of Sequence ID Numbers 4, 7, 12, 14, 17, 19, 22, 24, 30, 32
to 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 and 72.

9. A polypeptide encoded by the nucleic acid molecule according to any of claims 1 to 6.

10. A polypeptide according to claim 9 having an amino acid sequence of any of Sequence ID Numbers 4, 7, 12, 14,
17, 19, 22, 24, 30, 32 to 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 and 72.

11. An expression vector comprising a nucleic acid molecule according to claim 5 or 6.

12. An expression vector according to claim 11 which comprises an inducible promoter.

13. An expression vector according to claim 11 or 12 which comprises a sequence encoding a reporter molecule.

14. A nucleic acid molecule according to any of claims 1 to 7 for use as a medicament. 

15. Use of a nucleic acid molecule according to any of claims 1 to 7 in the preparation of a medicament for 
treating Candida albicans associated diseases. 

16. A polypeptide according to any of claims 8 or 10 for use as a medicament. 

17. Use of a polypeptide according to any of claims 8 to 10 in the preparation of a medicament for treating Candida 
albicans associated infections. 

18. A pharmaceutical composition comprising a nucleic acid molecule according to any of claims 1 to 7 or a
polypeptide according to any of claims 8 to 10 together with a pharmaceutically acceptable carrier, diluent or
excipient therefor.

19. A Candida albicans cell comprising an induced mutation in the DNA sequence encoding the polypeptide according
to any of claims 8 to 10.

20. A method of identifying compounds which selectively modulate expression of polypeptides which are crucial for
growth and survival of Candida albicans, which method comprises:

(a) contacting a compound to be tested with one or more Candida albicans cells having a mutation in a
nucleic acid molecule according to any of claims 1 to 6 which mutation results in overexpression or
underexpression of said polypeptides in addition to contacting one or more wild type Candida albicans cells
with said compound,

(b) monitoring the growth and/or activity of said mutated cell compared to said wild type; wherein
differential growth or activity of said one or more mutated Candida cells is indicative of selective action
of said compound on a polypeptide or another polypeptide in the same or a parallel pathway.

21. A compound identifiable according to the method of claim 20. 

22. A compound according to claim 21 for use as a medicament. 

23. Use of a compound according to claim 21 in the preparation of a medicament for treating Candida 
albicans associated diseases. 

24. A pharmaceutical composition comprising a compound according to claim 21 together with a pharmaceutically 
acceptable carrier, diluent or excipient therefor. 

25. A method of identifying DNA sequences from a cell or organism which DNA encodes polypeptides which are
critical for growth or survival of said cell or organism, which method comprises:

(a) preparing a cDNA or genomic library from said cell or organism in a suitable expression vector which
vector is such that it can either integrate into the genome in said cell or that it permits transcription of
antisense RNA from the nucleotide sequences in said cDNA or genomic library,

(b) selecting transformants exhibiting impaired growth and determining the nucleotide sequence of the cDNA 
or genomic sequence from the library included in the vector from said transformant. 

26. A method according to claim 25 wherein said cell or organism is a yeast or filamentous fungi. 

27. A method according to claim 25 or 26 wherein said cell or organism is any of Saccharomyces 
cen/isiae, Saccharomyces pombe or Candida albicans. 

28. Plasmid pGAL1 PSiST-1 having the sequence of nucleotides illustrated in Figure 2. 

29. Plasmid pGAL1 PNiST-1 having the sequence of nucleotides illustrated in Figure 4. 

30. An antibody capable of binding to a polypeptide according to any of claims 8 or 10. 

31. An oligonucleotide comprising a fragment of from 10 to 50 contiguous nucleic acid sequences of a nucleic acid 
molecule according to any of claims 1 to 7. 

32. A nucleic acid molecule encoding a polypeptide which is critical for survival and growth of the yeast Candida 
albicans, said nucleic acid molecule comprising the sequences of any of the nucleotide sequences illustrated in 
Figures 5 to 28. 

33. A polypeptide which is critical for survival and growth of the yeast Candida albicans, said polypeptide comprising 
the amino acid sequences of any of the sequences illustrated in Figures 29 to 39. 
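
As an illustration of the comparison recited in step (b) of claim 20, the following is a minimal sketch, not part of the claimed method, of how differential growth between a mutated strain and the wild type might be scored. The function names, the use of optical-density style growth readings, and the two-fold cutoff are all assumptions introduced for this example.

```python
import math
from statistics import mean


def relative_growth(treated, untreated):
    """Mean growth reading with compound divided by mean reading without it.

    Readings are assumed to be positive numbers (e.g. optical densities);
    this is a hypothetical readout, not one specified in the claims.
    """
    if min(treated) <= 0 or min(untreated) <= 0:
        raise ValueError("growth readings must be positive")
    return mean(treated) / mean(untreated)


def is_differential(mutant_treated, mutant_untreated,
                    wt_treated, wt_untreated, min_fold=2.0):
    """Compare compound sensitivity of a mutated strain with the wild type.

    Returns (flag, fold), where fold is the mutant's relative growth divided
    by the wild type's relative growth. The flag is True when the two strains
    differ by at least `min_fold` in either direction (an assumed cutoff).
    """
    mutant_rel = relative_growth(mutant_treated, mutant_untreated)
    wt_rel = relative_growth(wt_treated, wt_untreated)
    fold = mutant_rel / wt_rel
    return abs(math.log2(fold)) >= math.log2(min_fold), fold


# Hypothetical readings: the mutant is inhibited more strongly than wild type.
flag, fold = is_differential([0.2, 0.2], [0.8, 0.8], [0.6, 0.6], [0.8, 0.8])
print(flag, round(fold, 3))
```

Normalising each strain to its own untreated control keeps a slow-growing mutant from being flagged merely because it grows poorly in the absence of any compound.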
HindIII 

1 AGCTTGAGTA TTCTATACTG TCACCTAAAT AGCTTGGOT AATCATOGTC 
TCC5AACTCAT AAGATATCAC ACTOCSATTTA TCGAACCCCA WAGTACCAG 

51 ATAGCTCTTT CCTGTGTGAA ATTCTTATCC GCTCACAATT CCACACAACA 
TATCGACAAA OC^CACACW TAACAATAOG OGAOTGWAA GCTOICTTOT 

101 TACGAGCCGG AAGCATAAAK3 TCTAAAOCCT GCXKnOOCTA ATGAGTGAOC 
ATCCTCGOCC TTCGTATTTC ACATrTCGGA OCCCACOGAT TACTCACTOG 

151 TAACTCACAT TAATTCCCTT OCGCTCACTC COOXTTTOC AC TCOGGAAA 
ATTGACTGTA ATTAACGCAA CGCGAlGTGAC OGGOGAAAGG TCAGCCCTTT 

201 CXTTOTCOTGC CACCTGCATT AATGAATCGG CCAACGCWCG GOGAGAGOCG 
QC3JVCAGCACG GTCGACGTAA TTACTTAGCC OGTTGCGCGC CCCTCTCCOC 

251 CTTIGCCrrAT TGGGCGCTCT TCCGCITCCT CGCTCACTGA CTCGCTQCGC 
CAAACOCATA ACCCQCGASA AOGCGAAOGA GCGMTGACT GAGCGAOOOG 

301 TOGGTCGTTC OGCTQCGGCO AGCGGTATCA GCTCACTCAA AOGCX3GTAAT 
AGCCAGCAAG CCGACGCCGC TCGCCATAGT CGAGTGAGTT TCCGCCATTA 

351 ACOGTTATCC ACAGAATCAG GOGATAACGC AGGAAAGAAC ATOTGAGCAA 
TOCCAATAOG TCTCTTAGTC CXXTATTGCG TCCTTTCTEG TACACTCGTT 

401 AAGGCCAGCA AAAGGCCAGC AACCGTAAAA AGGCCGCGTT GCTGCCCTTT 
TICCGGTCOT TTTCCGGTCC TrGGCATTTT TCCGGCGCAA CGACCGCAAA 

451 TTCCATAGGC TCCGCCCCCC TGACGAfiCAT CACAAAAATC GACGCTCAAG 
AAGCTATCCG AGGCGGGGGC ACTGCTCGTA GTCTITTTAG CTGCGAGTTC 

501 TCACAGGTCG CGAAACCCGA CAGGACTATA AAGATACCAG GCCnTCCCC 
AGTCTCCACC GCTTTGGGCT GTCCTGATAT TTCTATGGTC CGCAAAGGOG 

551 CIGGAAGCTC CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCT TACCGGA 
GACCTTCGAG GGAGCACGCG AGAGGACAAG GCTGGGACGG CGAATOOCCT 

601 TACCTGTCCG CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC 
ATGGACAGGC GGAAAGAGGG AAGCCCTTCG CACCGCGAAA GAGTATCGAG 

651 ACGCTCTAGG TATCTCAGTT CGGTGTAGGT CGTTCGCTCC AAGCTOGGCT 
TGCGACATCC ATAGAGTCAA GCCACATCCA OCAAGCGAGG TTCGACCCGA 

ApaLI 

701 GTOTGCACGA ACCCCTOGT: CAGCCCGACC GCTGCXSCCrT ATCCQGTAAC 
CACACGTGCT TGGGOGGCAA GTCGGGCTGG CGACGCGGAA TAGOCCATTG 

751 TATCGTCTTG AGTCCAACCC GGTAAGACAC GACTTATCOC CACTCGCAGC 
ATAGCACAAC TCAGGTTGCG CCATTCTCTG CTGAATAOCC CTGACCGTCC 

801 AGCCACTGGT AACAOGATTA GCAGAGCGAG GTATGTAGGC GGTOCTACAC 
TCGCTGACCA TTGTCCTAAT CGTCTCGCTC CATACATCCG CCACGATGIC 

851 ACTTCTTGAA CTGGTGOCCT AACTACGGCT ACACTAGAAG GACACTATTT 
TCAAGAACTT CACCACCGGA TTGATGCCGA TCTGATCTfC CTGTCATAAA 

901 GGTATCTGCG CTCTGCTGA^i GCCAGTTACX: rrCGGAAAAA GACTTQGTAC 
ccATACACcc gacacgact: CGGTCAATOC AAGCCTTTTT CTCAACCATC 
951 CrcnCATCC GOCAAACAAA CCACCGCTC3G TAGCOOIOGT TTTTTTGTTr 
GACAACTACC COi ' mvl ' lT OCTOGCGACC ATCGCCACCA AAAAAACAAA 



1001 GCAAOCAGCA GATTACGCGC AGAAAAAAAG GATCTCAAGA AGAICCTTTG 
CGTTCCTCGT CTAATGCGCX; IVri ' lTrm CTAGAGtTCT TCTAGGAAW: 



1051 ATCTTTTCTA CQGGGTCTOA CGCTCAOTOG AACGAAAACT CAOOTTAAOQ 
TAGAAAAGAT OCCXCAGACT GCGAOTCACC TTQCTTTTGA GTGCAATTOC 



1101 GATTTTCGTC ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA 
CTAAAACCAG TACTCTAATA GTTTTTCCTA GAAGTQGATC TAOGAAAATT 



1151 ATTAAAAATG AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG 
TAATTTTTAC TTCAAAATTT AGTTAGATTT CATATATACT CATTTGAACC 



1201 TCTGACAGTT ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTC 
AGACTGTCAA TGGTPACGAA PTAGTCACTC CGTGGATAGA CTCGCTAGAC 



1251 TCTATTTCGT TCATCCATAC TTOCCTGACT CCCOCTCGTG TAGATAACTA 
AGATAAAOCA AGTAGOTATC AACOGACTGA GOGQCAGCAC ATCTATTGAT 



1301 CGATACGQGA GGGCTTACCA TCTGGCCCCA GTGCTGCAAT GATACCGCGA 
GCTATGCXCT CCCGAATGGT AGACCGOGGT CACGACGTTA CTATGGCGCT 



1351 GACrCACGCT CACCGGCTCC AGATTTATCA GCAATAAACC AGCCAGCCGG 
CTOGGTCCDGA GTOGCCGAQG TCTAAATAGT CCTTATTTGG TCOGTCGGCX: 



1401 AAOGGCCGAG CGCAGAAGTG CTCCTGCAAC TTTATCCGCC TCCATCCAGT 
TTCCCGGCTC GCGTCTTCAC CAGGACGTTG AAATAGGCGC AGGTAOGTCA 



1451 CTATTAATTC TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATACT 
GATAATTAAC AACOGCCCTT CGATCTCATT CATCAAGCGG TCAATTATCA 



1501 TTOCGCAACG TTGTTGCCAT TGCTACAGGC ATCGTGGTCT CACGCTCGTC 
AACGCGTIGC AACAACQGTA ACGATGTCCG TAGCACCACA GTGCGAGCAG 



1551 GTTTGGTATG GCTTCATTCA GCTCCGGTTC CCAACGATCA AGGCGAGTTA 
CAAACCATAC CGAAGTAAG7 CGAGGCCAAG GGTTGCTAGT TCCGCTCAAT 



1601 CATGATCCCC CATXyiTGTGC AAAAAAGCGG TTAGCTCCTT CX3GTCCTCCG 
GTACTAGGGG GTACAACACG TTTTTTCGCC AATCGAGGAA GCCAGGAGOC 



1651 ATCGTTGTCA GAAGTAACTT GGCCGCACTG TTATCACTCA TGGTTATGOC 
TAGCAACAGT CTTCATTCAA CCGGCGTCAC AATAGTGAGT ACCAATACGG 



1701 AGCACTGCAT AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCITTTCTO 
TCGTCACGTA TTAAGAGAAT GACAGTACGG TAGGCATTCT ACGAAAAGAC 



1751 TGACTGGTGA GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGOGA 
ACTGACCACT CATXiy^GTTGC TTCAGTAAGA CTCTTATCAC ATACGCCGCT 



1801 CCGAGTTCCT CTTGCCCGGC GTCAATACGG GATAATACCG COCCACATAG 
GGCTCAACGA GAACGGGCCG CAGTTATGCC CTATTATGGC OCQGTCTATC 



1851 CAGAACTTTA AAAGTGCTCA TCATTGGAAA ACGTTCTTCG GGGC GAAAAC 
GTCTTGAAAT TTTCACGAGT AGTAACCTTT TGCAAGAAGC CCCGCTXTIG 
ApaLI 

1901 TCPCAAGGAT CTTACCGCTC TTGAGATCCA OTTCGATCTA ACCCACTOST 
AGAGXTCCTA GAATGGCGAC AACTCTAGGT CAAGCTACAT TGOCTGAGCA 



ApaLI 

1951 GCACCCAACT GATCTTCACC ATCTITTACT TTCWXAOCC TTTCTGC3GT0 
CGTGGGTTGA CTAGAAGTCG TAGAAAAKSA AAGTGGTCGC AAAGACCCAC 



2001 AGCAAAAACA GGAAGGCAAA ATGCCXX:AAA AAAGOGAATA AGGGC GACAC 
'iWlTlT T GT CCTTCCGTTT TACGGCGTTT TTTCOCTTAT TCCCGCTCTG 



2051 OGAAATGTTG AATACTCATA CTCTTCCTTT TTCAATATTA TT GAAGCA TT 
CCTTTACAAC TTATGAGTA? GAGAAGCSAAA AAOITATAAT AACTTCGTAA 



2101 TATCAOGGTT ATTOTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA 
ATAOTOXJVA TAACAGAGTA CTCGCCTATC TATAAACTTA CATAAATCTT 



2151 AAATAAACAA ATAGGCX3TTC COCGCACATT TCCCC GAAAA GTGCCACCTG 
TTTATTTOTT TATCCCCAAC5 GCGCOTGTAA AGOGGCTITr CACOOTOGW: 



2201 ACGTCTAAGA AACCATTATT ATCATGACAT TAACCTATAA AAATAGGCGT 
TGCAGATTCT TTGGTAATAA TAGTACTGTA ATTGGATATT TTTATCCGCA 



2251 ATCACGAOOC CCTTTCGTCT CGOGCX?rTTC GGTGATGACC GTGAAAACCT 
TAGTGCTCCG GGAAAGCAGA GCX5CGCAAAG CCACTACTGC CACTrTTGGA 



2301 CTGACACATG CAGCTCCCGG AGACGGTCAC ACCTTGrTCTG TAAGCGGATG 
GACTGTGTAC GTCGAOOGCC TCTGCCAGTG TCGAACAGAC ATTCGCCTAC 



2351 CCGGGAGCAG ACAAQCCCG7 CAOGGOGCGT CAGCGGC3TGT TGGCQGGTCT 
GGCCCTCGTC TCTTCOGGCA GTCCCGCGCA GTCGCCCACA ACCGCCCACA 



ApaLI 



2401 CGGGGCTGGC TTAACTATGC GGCATCAGAG CAGATTGTAC TGAGAGTGCA 
GCCCCGACCX5 AATTGATACG CCGTAGTCTC GTCTAACATG ACTCTCACGT 



ApaLI 

2451 CCATATGCGG TGTGAAATAC CGCACAGATC CGTAAGGAGA AAATACCGCA 
GGTATACGCC ACACTTTATG GCGTCTCTAC GCATTCCTCT TTTATOGCGT 



2501 TCAOGCGAAA TTGTAAACG7 TAATATTTTO TTAAAATTCG CGTTAAATAT 
AGTCCGCTTT AACATTTGCA ATTATAAAAC AATTTTAAGC GCAATTTATA 



2551 TTGTTAAATC AGCTCATTTT TTAACCAATA GGCCGAAATC GG CAAAA TCC 
AACAATTTAG TCGACrTAAAA AATTOGTTAT CCOGCTTTAC CCGTTTTAGG 



2601 CTTATAAATC AAAAGAATAG ACCGAGATAG GCTTGACTGT TGTTCCAGTr 
GAATATTTAG TTTTCTTATC TGGCTCTATC CCAACTCACA ACAAGGTCAA 



2651 TGGAACAAGA GTCCACTATT AAAGAACGTG GACTCCAACG TCAAAGGGCG 
ACCTTCTTCT CAQGTGATAA TrTCTTGCAC CTGAGGTTCC ACTTTCCCGC 



2701 AAAAACCGTC TATCAGGGCG ATGGCCCACT ACGTGAACCA TCACC CXAAT 
TTTTTGGCAG ATAGTCCCGC TACCGGGTGA TGCACTTGGT ACTOGGTTTA 



2751 CAAGTTTTTT GCGGTCGAGG TGCCGTAAAG CTCTAAATCG GAACCCTAAA 
GTTCAAAAAA CGCCAGCTCC ACGGCATTTC GAGATTTAOC CTPOGGATTT 
2801 OOGkGCCCCC CATTTAGAGC TTGACGGGGA AAGCCGGCGA ACXTTOGCGAC 
CCCTCGG G GG CTAAATCTCC AACTOCCCCT TTCOGCCGCT TGCACCOCTC 



2851 AAAOGAACCG AACJVAACCGA AACGACCGOG CGCTACOGCC CTO GCAAC TC 
mCCTTCCC TrCTTTCGCT TTCCTCGCCC GCGATCCCOC GACCGTTCAC 



2901 TAGCGGTCAC GCTGCC3CGTA ACCACCACAC CCGCCGCGCT TAA TGCGOOG 
ATCOCCACTG CGACOCOCAT TGGTGGTGtG CCCGCCGCGk ATTAOGOGOC 



2951 CTACAGGGCG CGTCCATTCG CCATTCAGGC TGCX3CAACTG TTGGGAAOOO 
GATCTCCCGC GCAGGTAAGC GGTAAGTCCG ACGCGTTGAC AACCCTTCCC 



3001 CGATCX3CTCC GGGCCTCTTC GCTATTACGC CAOCTOGCGA AAOGGOGATO 
GCTAGCCACG CCCGGAGAAG CGATAATQCG CJTCGACCGCT TTCCOCCTAC 



3051 TGCTGCAAGC CGATTAAGTT GGCSTAACOCC AGOGTTTXC CACT CACGAC 
ACGACGTTCC GCTAAWCAA CCCATTGCGG TCOCAAAAOG GTCAOTQCTG 



3101 GTTGTAAAAC GACGGCCAGT GAATTGTAAT ACGACTCACT ATAGGGCGAA 
CAACATTTTO CTGCCGGTCA CTTAACATTA TQCTGACatSA TATCCCOCTT 



3151 TTOGTmCC AATGATGAOC ACTPrTAAAC TTCTGCTATG TGGOGCGGTA 
AACCAAAAOG TTACTACTCG TGAAAATTTC AAGAOtSATAC AOCGCGOCAT 



3201 TTATCCCGTG TTGACGCCGC GCAAGAGCAA CTCGGTCGCC GCATACACTA 
AATAGOGCAC AACTOCGGCC CGTTCTCGTT GAGCCAGCGG CGTATGTGAT 



3251 TTCTCAGAAT GACTTGGTTG AGTACTAATA GGAATTCATT TGGATGGTAT 
AAGAGTCTTA CTCAACCAAC TCATGATTAT CCTTAACTAA ACCTACCATA 



3301 AAACGGAAAC AAAAAAAAGA GCTGCTACTA CTTTCTTTAA AATTATTTTA 

rnGCcmG ttttttttct cgaccatgat gaaagaaatt ttaataaaat 



3351 TTATTTGATT TTATTTAATA GTATATATTA TATTTT GAAC GTAGATTATT 
AATAAACTAA AATAAATTAT CATATATAAT ATAAAACTTG CATCTAATAA 



3401 rTGTOAAAG TTGCTGTAGT GCCATTGATT CGTAACACTA ATTCTGTATT 
AACAACTTTC AACGACATCA CGGTAACTAA GCATTGTGAT TAAGACATAA 



3451 AGTCATTCC? CTTGTTTGAT AGTATCCAAA AAAACQGCTA TTTTTTTQCA 
TCAGTAAGGA GAACAAACTA TCATAGGTTT TTTTGCCGAT AAAAAAACCTT 



3501 ATCTTATTTC CTGCATATTA TACAGATAAC ATAATGAAAG AAAAAATCTT 
TAGAATAAAG GACGTATAAT ATCTCTATTO TATTACTTTC TTTTTTAGAA 



3551 TTmrrGTr cttcaatgat gatttcaacc attcttttaa acatigatca 

AAAAAAACAA GAAGTTACTA CTAAAGTTGG TAACAAAATT TCTAACTAGT 



3601 ATtCCTCAGC AACAACCCCA TACACACIGG TTTATATACC QCCCCTTTTA 
TAACGACTCG TTGTT G GGGT ATCTGTGACC AAATATATGC COOGGAAAAT 



3651 CAGTTGAAGA AAGAAATAGA AATAGAAATA GCAAACAAAA GATAT GACAC 
GTCAACTTCT TTCTTTATCT TTATCTTTAT CXilTlOTTTT CTATACPGTC 



3701 TCAACACTAA GACCTATAG? GAGAGAGCAG AAACTCATGC CTCACCAGTA 
AGTTGT G ATT CTGGATATCA CTCTCTCGTC TTTCAOTACC GAGTOGTCAT 



3751 GCACAGCGAT TATTTCGATT AATGGAACTG AAGAAAACCA ATTTATCTGC 
CGTGTCGCTA ATAAAGCTAA TTACCTPGAC TTCTTTTOGT TAAATACACG 
EcoRI 



3801 ATCAATTCAC GTTGATACCA CTAAGGAATT CCTTGAATTA ATTGATAAAT 
TACTTAACTG CAACTATOCT GATTCCTTAA OGAACTTAAT TAACTATTTA 



3851 TAGGTCCTTA TGTATGCTTA ATCAAGACTC ATATTGATAT AATCAATGAT 
ATCCAGGAAT ACATACGAAT TAGTTCTGAG TATAACTATA TTAOTTACTA 



3901 TTTTCCTATG AATCCACTAT TGAACCATTA TT AGAAC TTT CAC CTAAACA 
AAAAGGATAC TTAGGTGATA ACTTGGTAAT AATCTTGAAA GTGCATTTGT 



3951 TCAATTTATG ATTTTTGAAG ATAGAAAATT TGCTGATATT GGTAATACOC 
AGTTAAATAC TAAAAACTTC TATCTTTTAA ACGACTATAA CCATTATGOC 



4001 TAAAGAAACA ATATATTGGT GGAGTTTATA AAATTAOTAO tTGOCCAGAT 
ATTTCTTIGT TATATAACCA CCTCAAATAT TTTAATCATC AACCCGTCTA 



4051 ATTACCAATG CTCATOCTGT CACTGGGAAT GGAGTCCTTG AAGGATTAAA 
TAATGGTTAC GAGTACCACA GTGACCCTTA CCTCACCAAC TTCCTAATTT 



4101 ACAGC3GW3CT AAWSAAACCA CCACCAACCA AGAGOCAAGA OQGTTATTGA 
TGTCCCTCQA TTTCTTTGGT GGTGCSTPGGT ICTCGGT T CT CCCAATAACT 



4151 TXTTTAGCTCA ATTATCATCA GTGGGATCAT TAGCATATGG AGAATATTCT 
ACAATCGACT TAATACTAGT CACCCTAGTA ATCGTATACC TCTTATAAGA 



4201 CAAAAAACTG TTGAAATTGC TAAATCCGAT AAGGAATTTG TTATTGGATT 
CTTTTTPGAC AACTTTAACC ATTTAOGCTA TtCCPTAAAC AATAACCTAA 



4251 TATTGCCCAA CGTGATATGG GTGGCCAAGA A GAAG GATTT GATTGGCtTA 
ATAACGGGTT GCACTATACC CACCGGTTCT TCTTCCTAAA CTAACCGAAT 



4301 TTATGACACC TGGAGrTTGGA TTAGATGATA AAQGTGATGG ATTAO GACAA 
AATACTGTGO ACCTCAACCT AATCTACTAT TTCCACTACC TAATCCTOTT 



4351 CAATATAGAA CTOTPGATGA ACTPGTTAGC ACTGGAACTG ATATTATCAT 
GTTATATCTT GACAACTACT TCAACAATCG OXUCCPTGAC TATAATAGTA 



4401 TGTTGGTAGA GGATTGTTTC GTAAAGGAAG AGATCCAGAT ATT GAAGG TA 
ACAACCATCT CCTAACAAAC CATTTCCTIC TCTAOGTCTA TAACTTCCAT 



4451 AAAOGTATAG AAATGCTGGT TOGAATOCTT ATTT GAAAAA GACTOGCCAA 
TTTCCATATC TTTACGACCA ACCTTACGAA TAAACTTTTT CTGACCOGTT 



4501 TTATAAATGT GAAGGGGGAG ATTTTCACTT TATTASATTT GTATATATGT 
AATATTTACA CTTCCCCCTC TAAAACTGAA ATAATCTAAA CATATATACA 



4551 AGAATAAATA AATAAATAAG TTAAATAAAT AATTAAATAA GOGTGGTAAT 
TCTTATTTAT TTATTTATTC AATTTATTTA TTAATTTATT CCCACCATTA 



4601 TATTACTATT TACAATCAAA GGTGGTCCTT CTAGCTGTAA TCCOGGCAOC 
ATAATGATAA ATGTTAGTTT CCACCAOGAA GATCGACATT AGGCCCGTOG 



4651 GCAACGGAAC ATTCATCAGT GTAAAAATGG AATCAATAAA GCCCPGCGCA 
CGTTGCCTTG TAAGTAGTCA CATTTTTACC TTAGTTATTT COGGACGCGT 



4701 GCGCGCAGGG TCAGCCTGAA TACOCCTTTA ATGACCAGCA CACTCGIGAT 
COCOCCTCCC AGTCGGACTT ATCCGCAAAT TACTGOTCGT GTCAGCACTA 
4751 OOCAAGCTCA CAATAOCCCA AGTCOGOCGA GGGGCCTOTA CAG roAOgSA 
CCOnCCAGT CTTATCQGGT TCAGCCGGCT CCCCGGACAT GTCACTCCCT 



4801 AGATCTOATA TTCACGAAGA GGAACCAATO TAAOCrTTACA CTQAAGAAAA 
TCTAGACTAT AACTOCTTCT C CTIG GTT AC ATPQCAATGT GACTTCTTTT 



4851 CACACAATAA ACGOGAAGAA ACC5GTCTAAA AOTOTGAAAA TAATTTTTOA 
CTGTGTTATP U ^ CX;CX''Iirn' TGCXACATTT TCACACTPIT ATTAAAAACT 



4901 ATATCATTTC CCTTOCTTTA ATTCCAAACG AAACGTGTTT rmTAGAGA 
TATAOTAAAG GGAACCAAAT TAAGGrTTGC TTTCCACAAA AAAAATCTCT 



EcoRI 



4951 AIXSGGAATTC TTATTGGATC TCTAGATTGT TTGTTTACTC CAGACTGTGC 
TACCCTTAAG AATAACCTAC AGATCTAACA AACAAATGAG GTCTGACACG 



ApaLI 

5001 ACAAAAACGT TTGGATGGAT GATCAGAAGA TATTTTTAOG CTTAOCTCTA 
roTTTTTGCA AACCTACCTA CTAGTCTTCT ATAAAAATCC GAATCGAGAT 



5051 AATATAAGAA ATGATGCTTG AAAAACCAGA CAGAAATTGA i/i A i vAAAAA 
TTATATTCTT TACTACGAAC ITI T T US T CT GTCTTTAACT CAAAGTTTTT 



5101 TTGOTAATGT GAGGTATTAG TCAACTAACC AAATAACAAT GCAAACOGOT 
AACCATTACA CTCCATAATC AGTTGATTGO TTTATrOTTA CGTTTOGCCA 



5151 TCATACATTT CATTTTGAAA ATAATGAAAC TQGAATTGGA TGACCAGCAC 
ACTATGTAAA CTAAAACTTT TATTACTTIG ACCTTAACCT ACTOGTCGTG 



5201 ACAAACACAT AAAGTAATTA TGGGAATTAG AAGC GAACAT AGAGG^AC 
TGTTTGTGTA TTTCATTAAT ACCCTTAATC TTCGCTTOTA TCTCCTCATG 



5251 TTGGCCACGA ACAGAATACA ACTGGGAACA CTArTTTCTC CATTGTTTTA 
AACCGCTGCT TGTCTTATGT TCACCCTTGT GATAAAAGAG GTAACAAAAT 



5301 G rK TC TTl'T TTTGTCAGCC TAGTTTTGTG CTATGTGTAA AAAATATTOC 
CAAGACAAAA AAACACTCGG ATCAAAACAC GATACACATT TTTTATAACO 



HindIII 



5351 CAAGAAAAAA AGCTTCTTTT GTGGCCACTO TCCGAAAAAA ATTTTGOOGA 
Gl -X I TTTTT TCGAACAAAA CACCGGTCAC AGOCTTTTTT TAAAACCCCT 



5401 ATCTTCGGAT TAATTTATGT TTTCATTCCA TCGGGCAAAG TGGGGOGGAA 
TAGAAGCCTA ATTAAATACA AAAGTAAGGT AGCCCCTTTC ACCOCCCCTT 



5451 AAAATTTTAA GCAGTTCACA AAACCTTCCA AAAAATATAT GGACAAACA T 
TTTTAAAATT CGTCAAGTGT rTTGCAAOGT TTTTTATATA CCTffrTTCTA 



5501 GATTCTATTT TCCCGACACC AAAATCATAA TTAATTATGA GAAACTTAAA 
CTAACATAAA AGGGCTCTOG TTTTAGTATT AATTAATACT CTTTCAATTT 



5551 TGTAACGTTA CAATTTATGr TTATTTGAAG GTGAAAAGCG ATTTATGATT 
ACATTGCAAT GTTAAATACA AATAAACTTC CACTTTTCOC TAAATACTAA 



5601 TTTCCGAAAT GAAAATTTTT TTTAOCTTTA TTTTTTrTCT CGGG CAAAGA 
AAAGGCTTTA CTTTTAAAAA AAATCCAAAT AAAAAAAACA GCCCGTTTCT 
EcoRI 

5651 AAAACTCAAC AAGGATTATT AAAATTTTTC G Wm X S TTT GTCTCTGGAG 
rriTCACTTG TTCCTAATAA TTTTAAAAAC CACAAACAM CACAGACCTC 



EcoRI 



5701 AATTCATTCC TCTCTCATCT TCACACAATG TTTA GACAT C TGACAOGAT T 
TTAAGTAAGG AGAGAGTAGA AGTGTCTTAC AAATCTGTAG ACTGTGCTAA 



5751 CATtSATAGTT CGGTTTCCGG G G TTCGTG T T TAGTTTTCGT TTTTCTTTTt 
GTACTATCAA GCXyUAQOCC CCAACCACAA ATCAAAAGCA AAAAGAAAAA 



5801 rrTTOGAAAG AATGTTTTAG CTCATTC3GTT TTCTTTCTTC ATTCAATAGT 
AAAACCTTTC TTACAAAATC GAGTAACCAA AAG^AAGAAG TAAOTTATCA 



5851 nTCAAAGAA TTTCCCCACT TGTTATTACA ATCATATAAA AT TAAAC TTT 
AAACTTTCTT AAACGOGTGA ACAATAATGT TAGTATATTT TAATTXGAAA 



5901 GATATAAAAT AGAGTITGAA AGTrXCCCAG ATCCTTTTTG ATTTCTTrGT 
CTATATTTTA TCTCAAACTT TCAAAGGGTC TAGCSAAAAAC TAAAGAAACA 



5951 AAATTTTTTT TTCTCCCACA TATACACACA TA CAAAC CGA TTTTTATAAG 
TTTAAAAAAA AAGAOGGTGT ATATGTGTOT ATGTrtGGCT AAAAATATTC 



PstI AvaI BamHI 



6001 AAAGAGTTAT ACCX:TQCAGC TCGACCTCGA GGC3ATCCGGG CCCTCTAfiAT 
TTTCTCAATA TGGGACGTCG AGCTGGAGCT CCCTAGGCCC OGGAGATCTA 



AvaI 



6051 GCOGOCGCTA GGCCTCGAGG GACTTTTGCA CCAAAAATAA TTTATmCC 
CGCCGGCGAT CCGGAGCTCC CIGAAAACGT GGTTTTTATT AAATAAAAGG 



6101 AAAATAAAAT TTAAATAAAT AAAAATAACT CATAATTTAA TAAAAATTTC 
TTTTATTTTA AATTTATTTA TTTTTATTGA GTATTAAATT ATTTTTAAAG 



6151 AAAATCTTCT AGTGTCCTTT CATATGCAGT ACATTAGCCA TCAGTCACTT 
TTTTAGAAGA TCACAGGAAA GTATACGTCA TGTAATCGGT AGTCAGTGAA 



6201 AAACAOCATC TGCTGGTTGA AGAATGCTTG AAGCAATTGT CCAGTCCCAG 
TI TC TCCTAG ACGACCAACT TCTTACGAAC TTCGTTAACA GGTCAGGCXC 



6251 AGGCACAGOC TAOGAGATCT TCAGTTTCGG AGGTAACCTG TAAGTCTGTT 
TCCGTGTCCX; ATCCTCTAGA AGTCAAAGCC TCCATTOGAC ATTCAGACAA 



6301 AATGAAGTAA AAGTTCCTTA GGATTTCCAC TCTCACTATG GTCCAGGCAC 
TTACTrCATT TTCAAGGAAT CC7AAAGGTG AGACTGATAC CAGGTCCGTC 



6351 AGTGACTOTA CTCCTTOGCC TTCAGGTAAT GCAGAATCCT CCCATAATAT 
TCACTCACAT GAGGAACCGG AAGTCCATTA CGTCTTAGGA GGGTATTATA 



6401 CTTTTCAGGT GCACACTGCT CATGAGTTTT CCCCXGCTGA AATCTTCTTT 
GAAAAGTCCA CCTCTGACGA TTACTCAAAA GGGGACCACT TTACAAGAAA 



6451 CTCCAGTTTT TCTTCCAOGA CTGTCTTCAG ATGGTTTATC TCATGATAGA 
GAGGTCAAAA ACAAGGTCCT 3ACAGAACTC TACCAAATAG ACTACTATCT 



6501 CATTAGCCAG GAGGTTCTCA ACAATAGTCT CATTCCAGCC AGTGCTAGAT 
GTAATCOGTC CTCCAAGAGT TOTTATCAGA GTAAOC?PCGG TCACGATCTA 
6551 GAATCTTCTC TGAAAATAOC AAAGATGXTC TGGAOCATCT CATAGAJICOT 
CTTAGAACAG ACTTTTATCG TTTCTACAAC ACCTOGTAGA OTATCTACCA 



PstI 



6601 CAATGCGGCG IX. r i'CCTTCT OGAACTGCTC CAOCTGCTTA ATCTCCTCAC 
GTTACGCCGC AGGAGGAAGA CCTTGACGAC GTCGACGAAT TAGAGGAGTC 



6651 OGATGTCAAA GTTCATCCTC TCCTTGAGOC AGTATPCAAG CCTCCCATTC 
CCTACAOTTT CAAGTACCAC AOGAACTCCG TCATAAMTC GGACGGTAAO 



6701 AATIGCCACA GGAGCTTCTG ACACTGAAAA TTGCTGCTTC TTTGTAOGAA 
TTAACGGTGT CCTCGAAGAC TGTGACTTTT AACX5ACGAAG AAACATCCTT 



6751 TCCAAGCAAG TTGTAGCTCA TGGAAAGAGC TGTAGTGQAG AAG CACAACA 
AQCTTCGWC AACATCGACT ACCTTTCTCC ACATCACCTC TTCGTOTTOT 



AvaI 



6801 GGAGAGCAAT TTOGAGGAGA CACTTOPTQG TCATGTTCCT C GAOGCC TTT 
CCTCTCGTTA AACCTCCTCT CTGAACAACC AGTACAAOGA GCTOCOGAAA 



BamHI 

6851 TTOGCCAGCT GGCGCCTGCT GCGCGACGGC GAGCTOCTCA CCACCCAGGA 
AACCGGTCGA CCGCGGACGA CGCGCTGCCG CTCGACGAGT OGTGGGTCCT 



BamHI 

6901 TCCGTCCCCC TTTTCCmG TCGATATCAT GTAATTACTT ATGTCACGCT 
AOGCAOGGGG AAAAOGAAAC AGCTATAGTA CATTAATCAA TACAGTOCXSA 



6951 TACATTCACG CCCTCCCCCC ACATCCGCTC TAACC3GAAAA OGAAOGAG TT 
ATGTAAGTGC GGGAGGGGGG TGTAOCCGAC ATIGOCTTTT CCTTOCTCAA 



7001 AGACAACCTG AAGTCTAGG7 CCCTATTTAT TTmTATAG TTATOTrAGT 
TCTCTTGGAC TTCAGATCCA GOGATAAATA AAAAAATATC AATACAATCA 



7051 ATTAACAACG TTATTTATA7 TTCAAATTTT TCmTWTT CTGTACAGAC 
TAATTCTTGC AATAAATATA AAGTTTAAAA AGAAAAAAAA GACATCTCTG 



7101 GCOTCTACGC ATGTAACATT ATACTGAAAA CCTPGCrWA GAAOCTTrTG 
CX3CACAT0CG TACATTGTAA TATGACTTTT GGAACGAACT CTTCCAAAAC 



HindIII 

7151 GGACGCTCGA AGGCTTTAAT TTGCA 
CCTGCGAGCT TCCGAAATTA AACGT 
1 TTCCATCGGG GAAAGTGGGC GOGAAAAAAT TTTAAGCAGT TCACAAAACC 
AAOGTAOCCC CTTTCACCCC CCCmTTTA AAATTCGTCA AGTCmTTOG 



51 rrCCAAAAAA TATATCGXCA AACATGATTG TATTTTCCCG ACACCAAAAT 
AAOGTTTTTT ATATACCTGT TTCTACTAAC ATAAAACGGC TCTGGTTTTA 



101 CATAATTAAT TATGAGAAAG TTAAATCTAA CXnTACAATT TATGTrPATT 
CTATTAATTA ATACTCTTTC AATTTACATT GCAATCTTAA ATACAAATAA 



151 TCAAGGTGAA AAOCGATTTA TGATTTTTCC GAAA T GAAAA TTTTTTTTAC 
ACTTCCACTT TTCGCTAXAT ACTAAAAAGG CTTTACTTTT AAAAAAAATC 



201 GTTTATTTTT TTTGTCGGGC AAAGAAAAAC TGAACAAOGA TTATTAAAAT 
CAAATAAAAA AAACAGOCCG Vnvmvm ACTTGTTCCT AATAATTTTA 



EcoRI 



251 r mw roTT l OTn OI GTC tqgagaattc attcctctct catct tcaca 

AAAACCACAA ACAAACACAC ACCTCTTAAG TAAOGAGAGA GTAGAAOTGT 



301 CAATCTTTAG ACATCTGACA CGATTCATGA TACTTCGQTT TCCGOOGTTG 
GWACAAATC TCTAGACTGT GCTAAGTACT ATCAAGCCAA AOOCCCCAAC 



351 GTCTTTAGIT TlWriTnL TTnrnTTG QAAAGAATGT TPTAGCTCAT 
CACAAATCAA AAGCAAAAAG AAAAAAAAAC CTTTCTTACA AAATCGAGTA 



401 TGGTTTTCTT TCTTCATTCA ATAOTTTTGA AAGAATTTGC CCACTTGTTA 
ACCAAAAGAA AGAAOTAAOT TATCAAAACT TTCTTAAACG GGTGAACAAT 



451 TTACAATCAT ATAAAATTAA ACTTTGATAT AAAATAGAG T TTCAAACTTT 
AATGTTAGTA TATTTTAATT TGAAACTATA TTTTATCTCA AACTTTCAAA 



501 CCCAGATCCT TTTTGATTTC TTTGTAAATT TTTTTTTCTC CCACATATAC 
GGGTCTAGGA AAAACTAAAG AAACATTTAA AAAAAAAGAG GGTGTATATG 



PstI 



551 ACACATACAA ACCGATTTTT ATAAGAAAGA GTTATACCCT GCAGCTCGAC 
TGTCTATGTT TCGCTAAAAA TATTCTTTCT CAATATGGGA CGTCGAOCTG 



PstI HindIII AvaI 



601 CTCGACTGTT TAAACCTGCA OGCATOCAAG CTTGGOCAAA AAGGCCTCGA 
GAOCTGACAA ATTTOGACGT CCGTACGTTC GAACCGGTTT TTCCOGAGCT 



AvaI 

651 OGAACATGAC CAACAAGTGT CTCCTCCAAA TTGCTCTCCT GTTGTGCTTC 
CCTTGTACTG GTTGTTCACA GAGGAGGTTT AACGAGAGGA CAACACGAAG 



701 TCCACTACAG CTCTTTCCAr CAGCTACAAC TTGCTTGGAT TCCT ACAAAG 
AGGTCATCTC CAGAAAGGTA CTCGATCTTG AACGAACXTTA AGGATGTTTC 



751 AAGCAGCAAT TTTCAGTGTC AGAAGCTCCT GTGGCAATTG AATOGGAOGC 
TTCGTCGTTA AAAGTCACAG TCTTCGAOGA CACCGTTAAC TTACTCTCCG 



801 TTGAATACTG CCTCAAGGAC AGCATGAACT TTCACATCCC TGAGGAGATT 
'aacttatgac GGAGTTCCTO TCCTACTTGA AACTGTAGGG ACTCCTCTAA 
PstI 



851 AAGCAGCrCC ACCAGTICCA GAAGGAGGAC CCCGCATTGA CCATCTAXW 
TTCOTCGACG TCGTCAAGGT CTTCCTCCTG CGGCOTAACT GGTAGATACT 



901 GATGCTCCAG AACATCTTTC CTATTTICAG A CAAGAT tCA TCTAGCACXQ 
CTACGAGGTC TTGTAC5AAAC GATAAAAGTC TCTPCTAAOT A0ATCG1GAC 



951 GCTOGAATGA GACTATTGTT GAGAACCTCC TGCJCTAAirGT CTAffCATCAC 
OGACCTTACT CTGATAACAA CTCTTGGAGO ACOOATTACA QATAOTAGIC 



1001 ATAAACCATC TGAAGACAGT CCTGGAAGAA AAACTOGAGA AAGAAGAT TT 
TATTTGGTAG ACTTCT G TOV OGACCTTCTT TTTGAOCTCT TTCTTCTAAA 



1051 CACCAGCXX5A AAACTCATCA GCAGTCTOCA CCtGAAAAGA TATTATGGGA 
GTC3GTC0CCT TTTGAGTACT CGTCAGACOT OGACTTITCT ATAATACCCT 



1101 GGATTCTOCA TTACCTGAAG GCCAAOOAGT ACAOTCACTO TGCC TOQAOC 
CCTAAGACGT AATQGACTTC CGGTTCCTCA TGTCAGriGAC ACQC3A0CTQG 



1151 ATAGTCAGAG TQGAAATCCT AAGC»ACTTT TACTTCATTA ACAGACTTAC 
TAOXSWTCTC ACCTTTAGGA TTCCTTGAAA ATGAAGTAAT TGICTGAATC 



1201 AGGTTACCTC CGAAACTGAA GATCTCCTAC CCWPOCrTC TCGGACTGGA 
TCCAATGGAG GCTTTGACTT CTAGAGGATC GGACACGGAG ACCCTGACCT 



1251 CAATTGCTTC AAGCATTCTT CAACCAGCAG ATGCTOTPTA AGTGACTGAT 
GTTAACGAAG TTCGTAAGAA GTTGGTCGTC TACGACAAAT TCACTGACTA 



1301 GGCTAATCTA CTGCATATGA AAGGACA C TA CAAGATTTTG AAArTTTTAT 
CCGATTACAT GACGTATACT TTCCTGTGAT CTTCTAAAAC TTTAAAAATA 



1351 TAAATTATGA CTTATTTTTA TTTATTTAAA TTTrATTTTG G AAAA TAAAT 
ATTTAATACT CAATAAAAAT AAATAAATTT AAAATAAAAC CTTTTATTTA 



XmaI 
SmaI 
BamHI 



AvaI AvaI 



1401 TATTTTTGGT GCAAAAGTCC CTCGAGGCCT AGCGGCCXXrC TAGAGGATCC 
ATAAAAACCA CGTTTTCAGC GAGCTCCGGA TCGCCGGCGG ATCTCCTAGG 



XmaI 



SmaI 



AvaI 



1451 CCGGQCGCTA GGCGGCCGCT AQGCCTTTTT GGCCAAOCTC GAArtTCGAG 
GGCCCGCGAT CCGCCGGCGA TCCGGAAAAA CCOGTTOSAG CTTAAAGCTC 



XmaI 



SmaI 



EcoRI AvaI ClaI 



1501 GAATTCGACC TCGGTACCCG GGGGATCCAT CCGTCCCCCT TTTCCTTTGT 
CTTAAGCTCG AGCCATGGGC CCCCTAGCTA GGCAGGGGGA AAAGGAAACA 
1551 CGATATCATC TAATTAGTTA TCTCACXXTTT ACATTCACGC CCTCCCCCCA 
GCTATAGTAC ATTAAtCAAT ACAGTGCGAA TGTAAGTOCC GGAGGOGGGT 



1601 CATCCGCTCr AACCGAAAAG GAACGACTTA GACAACCTGA ACTCTAGGTC 
GTAGGCGAGA TTGGCTrTTC CTTCCTCAAT CTCTTGGACT TCAGATCCAG 



1651 CCTATTTATT TTTTTATAGT TATCTTAGTA TTAAGAACGT TATTTATATT 
GGATAAATAA AAAAATATCA ATACAATCAT AAWTO3CA ATAAATATAA 



1701 TCAAATTTTT CTTTTTTTTC TGTACAGACG CXTPCTACGCA TGT AACA TTA 
AGTTTAAAAA GAAAAAAAAC ACATOTCTGC GCACATCCGT ACATTGTAAT 



1751 TACTCAAAAC CTTGCTTGAG AAG G 'I Tf TGG GACGCTCGAA OGCTTTAATT 
ATGACTTTTC GAACGAACTC TTCCAAAACC CTOCQAGCTT GOGAAATTAA 



1801 TXXIAAGCTAG CTTGCCOTAA TCATGOTCAT AGCTGTTTCC TGTGT GAAAT 
ACGTTCGATC GAACCX3CATT AGTACCAGTA TCGACAAAGO ACACACTTTA 



1851 TGTTATCCCSC TCACAATTCC ACACAACATA CGAGCCGGAA GCA TAAAG TC 
ACAATAGGCG AGTCTTAAGG TGTGTTGTAT GCTCOGCCTT CGTATTTCAC 



1901 TAAAGCCTOG GGTGCCTAAT GAGTGAGCTA ACTCACATTA ATTGOCSTTOC 
ATTTCGGACC CCACGGATTA CTCACTCGAT TGAGTGTAAT TAACGCAACG 



1951 GCTCACTGCC CGCTTTCCAG TCGGGAAACC TCTCGTGCCA GAGA TCTCTG 
CGAGTGACGG GCGAAAGGTC AGCCCTTTGG ACAGCACGGT CTCTAGAGAC 



2001 CATTAATGAA TCGGCCAACG CGCGGGGAGA GGCGGTTTGC GTATTQOOOC 
GTAATTACTT AGCCGGTTGC GCGCCCCTCT CCGCCAAAOG CATAACCCGC 



2051 CTCTTCCGCT TCCTCGCTCA CTGACTCGCT GCGCTCOGTC GTTCGOCTOC 
CAGAAGGCGA AGGAGCGAGT GACTGAGCGA CGCGAGCCAG CAAGCCCACG 



ClaI 



2101 GGCGAGCX3GT ATCAGATCGA TCTCACTCAA AGGCGOTAAT ACOGTTATCC 
CCGCTCGCCA TAGTCTAGCT AGAGTOAGTT TCCGCCATTA TGCCAATAOG 



2151 ACAGAATCAG GGGATAACCC AGGAAAGAAC ATCTGAGCAA AAGGCCAGCA 
TCTCTTACTC CCCTATTGCG TCCTTTCTTO TACACTCGTT TTCCGGTCGT 



2201 AAAOOCCAGG AACCGTAAAA AGQCCXSCGTr GCTGGCGITT TTCCATAGGC 
TTTCCGGTCC TTGGCATnT TCCGGCGCAA CGACCGCAAA AAOGTATCCG 



2251 TCCGCCCCCC TGACGAGCAT CACAAAAATC GACGCTCAAC TCAGAGGTGG 
AGGCQGOGGG ACTGCTCGTA GTG7TTTTAG CTGCGAGTTC AGTCTCCACC 



2301 CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC CTGGAAGCTC 
GCrriGGGCT GTCCTGATAT TTCTATGGTC CGCAAAGGGG GACCTTCGAG 



2351 CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA TACCTCTCCG 
GGAGCACGCG AGAQGACAAG GCTGGGACGG CGAATCGCCT ATGGACAGGC 



2401 CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC ACGCTGTAOO 
GGAAAGAGGG AAGCCCTTCG CACCGCGAAA CAGTATCGAG TGCGACATCC 



ApaLI 



2451 TATCTCAGTT CGCTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGTGCACGA 
ATAGAGTCAA GCCACATCCA GCAAGCGAGG TTCGACCCGA CACACGTGCT 
2501 ACCCCCCGTT CAGCCXGACC GCTCCGCCTT ATCCGGTAAC TATCGtCTTC 
TOC3CSGGGCAA CTCGGOCTGG CGACOCGGAA TAOOCCATTG ATAOCAGAAC 

2551 AGTCCAACCC GOTAAGACAC GACTTATOGC CACTGGCAGC AGCCAC TGGT 
TCAGGTTOGG CCATTCTOTG CTGAATAOCG OTGACOOTCG TCGGTOACCA 

2601 AACAOGATTA GCAGAOCGAC GTATGTAOOC CGTQCTACAC AGTTCTTGAA 
TTOHXTAAT CGTCTCGCTC CATACATCCC CCACGATOIC TX»AGAACTT 

2651 GTCGTGGCCT AACTACOGCT ACACTAGAAC GACAGTATTT GCTATCTOCG 
CACCACCGGA TTGATGCCGA TGTGATCTTC CTGTCATAAA CCATAGAOOC 

2701 CTCTCCT G AA GCCAGTTACC TTCGGAAAAA GACTOGTAG CTCTTGATCC 
GAGACGACTT CGGTCAATQG AAGCCTTTTT CICAACCATC GAGAACTAOG 

2751 GGCAAACAAA CCACCGCtOG TAGCGGTGGT TrTTTTCTTT GCAACCAGCA 
CCGTTTOrrr GGKWCGACC ATCOCCACCA AAAAAAGAAA CCTTCGTCCr 

2801 GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTO ATCnTPCTA 

ctaatgcgcg Tcmrmc ctagagttct tctaggaaac tagaaaagat 

2851 CQGGGTCPGA CXXTTCACTGC AACXyVAAACT CACGTTAAGG GATTTTGCTrC 

gccocagact gcgagtcacc ttocttttga gtqcaattcc ctaaaaccag 

2901 atgagattat caaaaaggat cttcacctag atccttttaa at taaaaat g 
tactctaata gtttttccta gaagtggatc taggaaaatt taatttttac 

2951 aagttttaaa tcaatctaaa gtatatatga gtaaacttgg tctgacactt 
ttcaaaattt agttagattt catatatact catttgaacc agactoxcaa 

3001 accaatgctt aatcactgag gcaicxtatct cagcgatctc tctatttogt 
tcgttacgaa tt agtcactc cgtggataga gtcgctagac agataaaoca 

3051 tcatccatac ttgcctcac7 ccccgtcctg tagataacta cgatacggga 
agtaggtatc aacggactga ggggcagcac atctattgat gctatcccct 

3101 GOGCTTACCA TCTOCCCCCA GTGCTGCAAT GATACCGCGA GACCCACGCT 
CCCGAATGGT AGACCOGGGT CACGACGTTA CTATOGCGCT CTOGGTGCGA 

3151 CACCOGCTCC AGATTTATCA GCAATAAACC AGCCAGCCGO AAOGGCCGAC 
GTOGCCGAGG TCTAAATAGT CGTTATTTGG TCGGTCOGCC TTCCCGGCTC 

3201 CGCAGAACTG GTCCTCCAAC TTTATCCGCC TCCATCCAGT CTATTAATIC 
GCGTCTTCAC CAGGACGTTG AAATAGGCGG AGGTAGGTCA GATAATTAAC 

3251 TTCCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT TTGCXSCAACG 
AACOGCCCTT CGATCTCATT CATCAACCOG TCAATTATCA AACGCGTTGC 

3301 TTCTTGCCAT TCCTACAGGC ATCGTGCTCT CACGCTCGTC CTTTOCTATG 
AACAACGGTA ACCATGTCCG TAGCACCACA GTGCGAGCAG CAAACCATAC 

3351 GCTTCATTCA GCTCCGGTTC CCAACGATCA AGGCGACTTA CATGATCCCC 
CGAAGTAAGT CGAGGCCAAG GGTTGCTAGT TCCGCTCAAT GTACTAGGQG 

3401 CATGTTGTGC AAAAAAGCGC TTAQCTCCTT CGGTCCTCCG ATCGTTGTCA 
GTACAACACG TTnTTCGCC AATCGACGAA GCCAGGAGGC TAOCAACAGT 

3451 GAAGTAAGTT GGCCGCACTG 7TATCACTCA TOCrTATOGC AOCACT GCAT 
CTTCATTCAA CCGGCGTCAC AATAGTCAGT ACCAATACCG TCGTGACGTA 
3501 AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCnTTCPG TGACTOCTGA 
TTAAGAGAAT GACAGTACOG TAOGCATTCT ACGAAAAGAC ACTGACCACT 



3551 GTACTCAACC AAGTCATTCT GAGAATAGTO TATGCGGCGA CCGAfiTXGCT 
CATCACTTCG TTCAOTAAGA CTCTTATCAC ATACOCOGCT OGCTCAACGA 



3601 CTXCCCCOGC GTCAATACGG GATAATACCG COCCACATAO CASAAC TTO 
GAAOGCOCCG CAOTTATGCC CTATTATOGC 0CC5GTGTATC GTCTTQAAAT 



3651 AAAGTCCTCA TCATPGGAAA ACGTTCTTCG GGGCGAAAAC TCTCAAGGAT 
TTTCACGAGT AGTAACCTTT TGCAAGAAGC CCCGCmTC AGAGTTCCTA 



ApaLI 



3701 CTTACCGCTG TTGAGATCCA GTTCGATGTA ACCCACTCGT GCACC CAAC T 
GAATOOCX»C AACTCTAGGT CAAGCTACAT TGGGTGAGCA CGTGGGTTGA 



3751 GATCTTCAGC ATCTTTTACT TTCACCAGCG TTTCTGGGTG AGCAAAAACA 
CTAGAAGTCG TACWUUUVTCA AAGTGGTCGC AAAGACCCAC TCGTTTTTCT 



3801 GGAAOGCAAA ATGCCOCAAA AAAGGGAATA AGGGCGACAC GGAAATOTTO 
CCTTCCGTTT TACGGCGTTT TTTCCCTTAT TCCCOCTGTG CXnTTACAAC 



3851 AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAC3CATT TATCAGGerTT 
TTATGAGTAT GAGAAGGAAA AAGTTATAAT AACTTCGTAA ATAOTOXAA 



3901 ATTCTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA 
TAACAGAGTA CTCGCCTATG TATAAACTTA CATAAATCTT TTTATTTGTr 



3951 ATAGQGGTTC CGCGCACATT TCCCCGAAAA GTGCCACCTO ACGTC TAAGA 
TATCCCCAAG GCGCGTC3TAA AGGGGCTTTT CACQGTGGAC TCCAGATTCT 



4001 AACCATTATT ATCATGACAT TAACCTATAA AAATAGGCGT ATCACGAGGC 
TPGGTAATAA TAGTACTGTA ATTQGATATT TTTATCCGCA TAGTGCTCCG 



4051 CCTTTCGTCT CGCGCGTTTC GGTGATGACG GTGAAAACCT CTGACACATG 
GGAAAGCAGA GCGCGCAAAG CCACTACTGC CACTTTTGGA GACTGTGTAC 



4101 CAGCTCCCGG AGACGGTCAC AGCTTGTCTG TAAQCGGATG CCGGGAOCAG 
GTCGAGGGCC TCTGCCAGTG TCGAACAGAC ATTCGCCTAC GGCCCTCGTC 



4151 ACAAGCCCGT CAGGGCGCGT CACCGGGTGT TOGCGGGTGT CGGGGCTGGC 
TGTTCGGGCA GTCCCGCGCV GTCGCCCACA ACCGCCCACA GCCCCGACCC 



ApaLI 



4201 TTAACTATGC GGCATCAGAG CAGATTGTAC TGAGAGTGCA CCATATCGAC 
AATTGATACG CCGTAGTCTC GTCTAACATG ACTCTCACGT OGTATAGCTG 



4251 GCTCTCCCTT ATOCGACTCC TGCATTAGGA AGCAOCCCAO TAGTAGGTTG 
CGAGAGGGAA TACOCTGAGG ACGTAATCCT TCGTCGGGTC ATCATCCAAC 



4301 AGGCCGTTCA GCACCGCCGC CGCAAGGAAT GGTGCATGCA AGGAGATOGC 
TCCGGCAACT CGTGGCGGCG GCGTTCCTTA CCACGTACXTT TCCTCTACCG 



4351 GCCCAACAGT CCCCCGGCCA CGGGGCCTGC CACCATACCC ACMCC GAAAC 
CGGGTTGTCA GGGGGCCGGC GCCCCGGACG GTOGTATGOG TGCGCCTTTC 



4401 AAGCACTAAT AGGAATTGA7 TTGGATGGTA TAAACGGAAA CAAAAAAAAC 
TTCGTGATTA TCCTTAACTA AACCTACCAT ATTTGCCTTT CTTTTmTC 
4451 ftCSCTGGTACT ACTTTCnTA AAATTATTTT ATTATTTGAT TTTATTTAAT 
TCGACCATGA TGAAAGAAAT TTTAATAAAA TAATAAACTA AAATAAATTA 



4501 AGTATATATT ATATTTTGAA CGTAGATTAT ITIVIT GAAA GTTOCTCJTAC 
tCATATATAA TATAAAACTT CCATCTAATA AAACAACTTT CAACGACATC 



4551 tGCCATTGAT TCGTAACACT AATTCTGTAT TAGTCATTCC TCTTGTTTGA 
ACGGTAACTA AOCATTCTGA TTAACACATA ATCAOTAAOG AGAAC A AACT 



4601 TAGTATCCAA AAAAACGGCT ATTTTTTTGC AATCTTATTT OCTOCATATT 
ATCATAfiGTT TTTTTGCCGA TAAAAAAACG TTAGAATAAA 0C3A0QTATAA 



4651 ATACAGATAA CATAATGAAA GAAAAAATCT TTTmTTCT TCTTCAATQA 
TATGTCTATT CTATTACTTT CTTTTTPAGA AAAAAAAACA AGAAGITACT 



4701 TGATTICAAC CATTCTTTTA AACATT5A1C AATTCCTOAC CAAGAAOCCC 
ACTAAAGTIG OTAAGAAAAT TTGTAACIAG TTAAOGACTC GTTGTTGGGG 



4751 ATACACACTG CTTTATATAC CGCCCCTTTT ACACTT GAAG AAAGAAATAG 
TATGTCTGAC CAAATATATG CSCGGGGAAAA TGTCAACTTC TTTCTTTATC 



4801 AAATAGAAAT AGCAAACAAA AGATATGACA G TCAACAC TA AGACC TATAG 
TTTATCTTTA TCGTTTGTTT TCTATACTGT CAGTTGTGAT TCTGGATATC 



4851 TGAGAGACCA GAAACTCATG CCTCACCAGT AGCACAGCGA TTATTTCGAT 
ACTCTCTCCT CTTTGAGTAC GGAGTGGTCA TCGTGTCGCT AATAAAGCTA 



4901 TAATXjGAACT CyU^yUVAACC AATTTATGTG CATCAATTGA CGTTGATACC 
ATTACCTTGA CTTCTTTTOG TTAAATACAC GTAGTTAACT GCAACTATQG 



AvaI 



4951 ACTAAGGACT TOCTCGAGTT AATTGATAAA TTAGGTCCTT ATGTATOCTT 
TGATTCCICA AGGAGCTCAA TTAACTATTT AATCCAGGAA TACATACGAA 



5001 AATCAAGACT CATATTGATA TAATCAATGA TTnTCCTAT GAAT CCAC TA 
TTAGTTCTGA GTATAACTAT ATTAOTTACT AAAAAGGATA CTTAOGTGAT 



5051 TTGAACCATT ATTAGAACT: TCACGTAAAC ATCAATTTAT GATTTTTGAA 
AACTTGGTAA TAATCTTGAA AGTGCATTTG TAGTTAAATA CTAAAAACTT 



5101 GATAGAAAAT TTGCTGATAT TGGTAATACC GTAAAfiAAAC AATATATTOG 
CTATCTTTTA AACGACTATA ACCATTATOG CATTTCTTTG TTATATAACC 



5151 TGGAGTTTAT AAAATTAGTA GTTGGGCAGA TATTACCAAT GCTCATGGTG 
ACCTCAAATA TTTTAATCAT CAACCCGTCT ATAATGGTTA CGAGTACCAC 



5201 TCACTGGGAA TGGAGTGGTT OAAGGATTAA AACAQGGAGC TAAAGAAAOC 
A3TGACCCTT acctcacca;^. CTTCCTAATT TTGTCCCTCG ATTTCTTTOG 



5251 ACCACCAACC AAGAGCCAAG ACGGTTATTG atgttagctg aattatcatc 
TGCTOGTTGG TTCTCGGTTC TCCCAATAAC TACAATCGAC TTAATAGTAG 



5301 AGTGGGATCA TTAGCATATC GAGAATATTC TCAAAAAACT GTTGAAATTG 
TCACCCTACT AATCGTATAC CTCTTATAAC AGTTTTTTGA CAACTTTAAC 



5351 CTAAATCCGA TAAGGAATTT GTTATTGGAT TTATTGCCCA ACGTX5ATATG 
GATTTAGOCT ATTCCTTAAA CAATAACCTA AATAACGGGT TOCACTATAC 
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5401 GGIXJCCCAAG AAGAAGGATT TGATTGGCTT ATTATGACAC CTOGAOTTOG 
CCACCGGTTC TTCTTCCTAA ACTAACCGAA TAATACTGTG GACCTCAAOC 



5451 ATTAGATGAT AAAGGTGATG GATTAOGACA ACAATATAGA ACTOTTOAtG 
TAATCTACTA TTTCCACTAC CTAATCCTGT TCTTATATCT TGACAACTAC 



5501 AAGTTGTTAG CACTQGAACT GATATTATCA TTGTXGGTAC AGGATTGTTT 
TTCAACAATC GTGACCTTGA CTATAATAGT AACAACCATC XCCTAACAAA 



5551 GGTAAAGGAA CAGATCCAGA TATTGAAGGT AAAAG GTATA GAAATGCTOO 
CCA l TtC C TT CTCTAOGTCT ATAACTTCCA TTTTCCATAT CTTTAOGAOC 



5601 TTCGAATGCT TATTTGAAAA AGACTOGCCA AtTATAAATG TGAAGOCSOGA 
AACCTTACGA ATAAACTTTT TCTGACCQGT TAATATTTAC ACTTCCX3CCT 



5651 GATTTTCACT TTATTAGATT TGTATATATG TAGAATAAAT AAATAAATAA 
CT AAAAGICA AATAATCTAA ACATATATAC ATCTPATITA TTOVmATT 



5701 GTTAAATAAA TAATTAAATA AGGGIOGTAA TTATPACTAT T TACAAT CAA 
CAATTTATTT ATTAATTTAT TCCCACCATT AATAATGATA AATOTTAOTT 



5751 AGGTQGTCCT TCTAGCTGTA ATCCGGGCAG CGCAACGGAA CATTCATCAG 
TCCACCAOGA AGATCGACAT TAGGCCCXat: GCOTTGCCTT CTAAGTAGTC 



5801 TGTAAAAATC GAATCAATAA AGCCTTGCGC TCATGAGCCC GAAGTQGCGA 
ACATTTTTAC CTTAGTTATT TCGOGACGCG AGTACTCGQG CTTCACOQCT 



5851 GCCCGATCTT CCCCATCGGT GATGTCGGCG ATATAGGCGC C AGCAACC GC 
CGGGCTAGAA GGOCTAGCCA CTACAGCCGC TATATCCGCG GTCX3T1GGCX5 



5901 ACCTOTCGOG CCGCAGCGCG CAGGGTCAGC CTGAATACGC GTTTAATGAC 
TCGACACCGC GGCGTCGCGC GTCCCAGTOG GACTTATGCG CAAATTACTG 



5951 CAGCACAGTC CTGATGGCAA GGTCAGAATA GCCCAAGTCG GCCGAGGGGC 
GTCGTGTCAG CACTACCGTT CCAGTCTTAT CGGGTTCAGC CGGCTCCCOG 



6001 CTGTACAGTG AGGGAAGATC TGATATTGAC GAAGAGGAAC CAAXGTAACG 
GACATGTCAC TCCCTTCTAG ACTATAACTG CTTCTOCTTG OTTACArPOC 



6051 TTACACTCAA GAAAACACAC AATAAACGGG AAGAAACGGT QT AAAAG TOT 
AATX3TGACTT CTiTTGTCTG TTATTTGCCC TTCTTTGCCA CATTTTCACA 



6101 GAAAATAATT TTTGAATATC ATTTCCCTTG GTTTAATTCC AAACGAAACG 
cnTTATTAA AAACTTATAG TAAAGOGAAC CAAATTAAGG TTTGCTTTGC 



EcoRI



6151 TCTTTTTTTT AGAGAATGGG AATTCTTATT GGATCTCTAG ATTGTXTCIT 
ACAAAAAAAA TCTCTTACCC TTAAGAATAA CCTACAGATC TAACAAACAA 



ApaLI



6201 TACTCCAGAC TGTGCACAAA AACGTTTGGA TGGATGATCA GAAGATATTT 
ATGAGGTCTG ACACGTGTTT TTCCAAACCT ACCTACTAGT CTTCTATAAA 



6251 TTAGGCTTAG CTCTAAATAT AAGAAATGAT GCTT GAAAAA CCAGAC AGA A 
AATCCGAATC GAGATTTATA TTCTrTACTA CGAACTTTTT GGTCTGTCTT 



6301 ATTGAGTTTC AAAAATTGGT AATGTGAGGT ATTAGTCAAC TAACCAAATA 
TAACTCAAAG TTTTTAACCA TTACACTCCA TAAtCAGTTG ATTGGTTTAT 
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6351 ACAATOCAAA CCQGTTGATA CATTTCATTT T GAAAA TAAT GAAAC TQGAA 
TCTTACGTTT OGCCAACTAT GTAAACTAAA ACTTTTATTA CTTTGACCTT 



6401 TTGGATGACC AOCACACAAA CACATAAACT AATTATOGGA AT TAOAAGOG
AACCTAC10G I WimW lT GTGTATTTCA TTAATACCCT TAATCTTOaC 



6451 AACATAGAOG ACTACTTOGC CACGAACAGA ATACAMnOG GAACAC TATT 

rroTAicrcc tcatcsaacco gpG c nxrrcr tatgttcacc cttotgataa 



6501 TTCTCCATTG TTTTAGTrCT OTmTl T G T CAGCCTAOT TTGTGCTATG 
AAGAGGTAAC AAAATCAAGA CAAAAAAACA GTCOGATCAA AACAOGXTAC 



HindIII



6551 TGTAAAAAAT ATTQCCAAGA AAAAAAQCTT WmGTOGC CMStOTCCGA 
ACATTTTTTA TAAOXSTTCT TTTTTTCGAA CAAAACAOCG GICACAOOCT 



6601 AAAAAATTTT GOCSGAATCTT OGGATTAATT TATOTTTTCA 
TTTTTTAAAA CCCCTTACAA GCCTAATTAA ATACAAAAGT 
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Sequences with unknown function, C. albicans sequence NOT present in the public domain 
(ALCES/EMBL) 
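Each record header in this listing appears to give a record name, the record length in bp, and the coordinate ranges contributed by each source (in-house, public, PathoSeq). The following is a minimal sketch, assuming that reading of the headers, of how such a header could be parsed and checked so that the ranges tile the full length without gaps or overlaps. The function names `parse_header` and `covers_exactly` are hypothetical and are not part of the original document.

```python
import re

def parse_header(header):
    """Split a record header into name, optional alias, length, and source ranges.

    Assumed layout (an interpretation, not stated in the document):
    >NAME (ALIAS) <length>bp source: a-b/c-d source: e-f ...
    """
    m = re.match(r">(\S+)(?:\s+\(([^)]*)\))?\s+(\d+)\s*bp\s+(.*)", header.strip())
    if m is None:
        raise ValueError("unrecognised header: " + header)
    name, alias, length, rest = m.group(1), m.group(2), int(m.group(3)), m.group(4)
    sources = {}
    for label, spans in re.findall(r"([A-Za-z-]+):\s*([\d\s/-]+)", rest):
        compact = "".join(spans.split())  # drop any whitespace inside the range list
        sources[label] = [
            tuple(int(x) for x in span.split("-"))
            for span in compact.split("/") if span
        ]
    return name, alias, length, sources

def covers_exactly(length, sources):
    """True if all ranges, taken together, tile 1..length with no gap or overlap."""
    spans = sorted(s for ranges in sources.values() for s in ranges)
    expected_start = 1
    for start, end in spans:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1

# Example using the header of record 328c2, with its length read as 1803 bp.
example = ">328c2 1803bp in-house: 1123-1803 public: 1-436/468-1021 PathoSeq: 437-467/1022-1122"
name, alias, length, sources = parse_header(example)
print(name, length, sources, covers_exactly(length, sources))
```

A check of this kind is also a convenient way to spot digits that were damaged during text extraction, since a misread coordinate usually leaves a gap or an overlap.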

>328c2 1803bp in-house: 1123-1803 public: 1-436/468-1021 PathoSeq: 437-467/1022-1122

ATGTCTATTACAGTrACATTTCCOAAATCTCCATCTACGAAAAAACOTCXrACCG 
GCATTTGGAATTGAGTrGOAGTTYAG 

TCAMCAAGSCAGTAGCOATGGTGCTATAGAOAAAAGCGGCATrGGCAGTTCCr 
GTGTTTAGCGTTGACAACCAAGACTWT 

Cn-ATTKATAAGAGAYCWTGCCAAGTACTGGO<jCTACCCTrCATCGTATCAATT 

GATTGTCAAGTTGGTCAAATOTGCTAA
CAITGAAAAGTCGCAA.'^TCTrAAAGACCGATAAGGATrTGAATAOAGAGTrGT 

TrGAGTrGGATTTGATrGAAGAAGCAG 

ATACAAAGATTGATCTITmATATTTCGTrACCCTTGGTCTATrCAAOAATAGA 
AAATAAGAAOGTTTnTATGTTCTG 

CGTGAACCAGAACAGCCAAAGGTGTCOAAAGCMCCAACACAAGAGAAACCAG 
CAAGTGTGGTTGCTGCAGAAGAAGATGA 

CGATAATCTAGATGATGATGAGGAGGACGAAGTGGATGAAGACATGGATGAA 

GATAATGATAATAGTGGGGAATTGTCTA 

AAGGATACAAGCACATGCACAAGGACCATCCAAAGTATATAAATGACGATAG 

GGTTACTATTGGACAAGTGTTTCATCAA 

TACGGACTTGACCCrrCGACACCATTAACCCATrCACTnTCAATAGTATCAAC 

TCAATGTCGAAGCTAA.\CTATTACAA 

GAATmGGAGTrTCAGGTrACCGATTrCTTCCCAACAGCAAGTTATCTTATGC 

AGAACGAGAATTGGTGTTGAATGCCA 

ACAACTACAATGATATGCACATTAACGAAAAGACAGAATCCAAGCCGAAAAA 

GAGTrTCCGTAAACCCATTGGAAAGTCA
AAOAAACATAACTTGCAGAITGATCCGAACTCCATAGAnTAAGCGAGTCAGT 

GATTCCGGGACAAGGGTTTATACCTGA
CnTAGTATCCACCTATCnTGCAAAOTCCCTAATTATTATGTGACATCAACCC 

ACCAAAGTCTCCCGCTGTCGTTCAAC

ACAAAGAATCTTAATGCAACrTCGAACTCTrCGTATrTGTTTAATGATAATGTC 

AAGATAAAGTCAAAAAGTATTCAGAA ^ ^ ^ 

GTWSGTGTTCAACAGCGATACCGATAATrACCATCACACAAAGTATITCTACA 

CCAAAACCTACCGTGGTCCAGGGTCGG 

GGAATTACAAGGATGGTGCATTGATGAACAAAATCAACAAGATACATCnTCC 
AGTAATAAAAAGCCGCGCCACAAGAGA 

AAGGTGTCGAACAATAACAGGTACAACAAGAOTrTAAAGGGGTTAGTCCACG 

AAAAGTTTGACAAGA^CTTTGTTGAGTA ' 

CrrGCTrrCTGAGCAACGCAAGTATACCGAGGACTATTCCAATCTTGAAATnT 

ACACAATAGCTTACAGTTTAATGTTC .^.r^.r>K^r> 
mTGAATACGTATCGTGGTGTTGCCCAAGAGACATGGAATAACTACTACAAG 

TTTAA ATTGATTGATrrCG AACAATTG 

AAGGCnTGCAAATGGAGGCAAATGAGCTTGAGGAGAGAAAATTGGATGCTG 
CTAGACACCAACAGTGGGCGGAAGAAGA 

GAAGCTTrNCCAAGAAAGATTGCGTTTAGTATrTGAAGATGAACGGACGAGTr 
TGAGCAATTGCAAAGCGAGTTTGGTCA 
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GAGAAAGAAGGATTrC3GAAGAGAAATrGCGTCGCCGTCAGCTANANCX:ATCTr 

TGANTGATAOmTGAACTTGATAGCG 
AAAATGACNATOAATCTTGACTTONCCAAANTNAACAAGACTr 



mGllcTCTT?GiTATiCUCTATaCACTCa^ 

ATCnAAarrUTCTCTaAAACaCAiTCAATGTATAAAAaUTUWATmA^ 
TATAATGAACUCAAmCTAAAA»K;ATO?rATaA<UAATTATACGCTJ^ 

caxrr*<rrTT3ccAA<»naxAAO7rTaATAcm<acAA0GTTCCG^^ 

AATCTACAAAAAAACATTCACAATmAAACAACCmATATATUAIjkTT^T^^ 

TGUCATCAT«TTTCCTCn»AATATACCnTAAaACCAGATTC^ 

(nrUGACATICaTUTGACACAAAAGATTOTGaATATTTrrACCCTC^ 

CACAncrrTCTATCACCAATACCTmCCTAACACAGGAACai^ATTGAC^ 

TaCTACAAATCAAACGAmACAATAGTCCCAATCTCAAATaTCTATAlTm^^ 

OrACATAATAOCAATATCTAAAAOTaAAATOCTACrGTACCTTaACmCTCCTTaTC^ 

ACTTCCTUTCTaAAAATCATCTCTTCACACATTCCACGTTGT 
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TTTTmTAC4*TATWmC*TCT(:TT?nAaAmCAACA^^^ 
CaCCACCAJUAaTC4T*<rr<»aCTrUTTCAAGA4TATATTMTUCCATTAm 
TAOatCTAAAACCmATATCTUATTAATTAAATACCAiAAU 
ATTAUnAUmCATnCATTTCTOaACTaTATAGTWrMTACaCTM^^ 

maUATAACUnATUTAATACTAATAATAUCGAAjmACAAacmAA^^ ^ 
ACAA(UACAAAAAACaCAAGACOCCTACTAGATATAirr(»CTTAAU i , 

<XAaAA<UCaCCCAmAACAAG*ATCAnACCTOCA2aCCTe^ ~'J 
TTCnCTTTUTAATATCCACCiCaCCACCTOMnWTCTTfiATaTWC^ ^ 
TATATCCrrGnCTTOTCAT*CTaitCA-GA<XA(XACCAO^ 

TanGCCTTCT<uTTCT<OT*cATiAa7Tcrr<TOsrrcnS^IJS5S^ 

ACTACCmAacaiTATTCCTCTTTTCACATTCmATATTTATJ^CTC^ 

GATTCACTATATACaCTTWAAamaTAACamATCTAAACTTCATATAACATGCATTAGCaTGATaTCaG 

AAcrAaerreAATcre 

: QOSVVPQSCP HYSQ07C0KC MFSCCGGCHG HTOQQQCm YCPPPFOOGY 
SI TOQCPCCGCC TVg&KCOC? ByVQOQPSSC CMOSCIMCCL AaCTCCTlD 
131 Klf 



<KAACATCTAaCTCCA(rrTrTTTM70TiAT<nTAC*CAA(KjUAC^ 

CnCrACAAAnACGAiAAArGmCACATGTAltUAAAAamATCTATACTATTTCTOTCCUa^ 

AATGATACTGATATCTCCTArTAGGATACAGTTATCTApATaGTATAATAATAATC 

TCGATGCACTTaCCAGaAAACAATACAACCCATTTTGCAGCAAAATCAGAaTTOAaGmA^ 

ACAAmCTCCATTCAAATAATTCCACAATAAaAAArAACAUGAACAAACCTACTAACAJaA^ 

CmGAAAATCmACATACTCAACnCTAAAGAmATAATaSGATCOTATTCATCA^^ 

TGCAGGTGATTATGACCCAGGTGUiCAATTCmACTAAAAATCTA-SCAGnGmy^^^ 

CTCTCTCTAACCTATACAACATAACATTTGTAAlCGGmGAATAJCaCAACGTC^ 

CAAATTTaUTtUTATAncmATC7CAA(nATAGCAAATAaAGG5CAAAAGOrreCAACAAAAcAAGAAC^ 

GTCGCAATTCrCTTCACCCTTTCACAATGTCCTCCTGTATGTGATCAAT 



AACCTiAmTTATATmACCAAGGTAACA0«KACCTCAnATCAT7AGTTCTCAATO 

AACACUCACncmGCTCnOCTiTTAAAAGAmTATATAATCAGGATAJLUGUTT^^ 

CAGOCACGCTAAATCATTCrrcnCCCTATAAACCAAAAATCTTATATGTaCAACnAAOTA™ 

AmACmACAGTGAATCATTaATATTnAATOAAAGCGACnTAGCTCAATGTCTTCAGi^ACAA 

GCACCACCAACAAAAGCACCAGaGCCrCCATOUTCTGGCTACAATTCCaOa 

CTCCATATaTCATCATCATCAaiSATAAGCCAGTATATCCACJUAAAGOCCrrCTCA^^ 

aCCAATAAAAATAACTAAACAACAAGTACOkCCTAAACAAATAGGTACATCTC^ 

TCCACTCATaTAAnCATCTT:CGATT:AAGTGCA(XTTCTmTTTTCTCA7^^ 

(m-ACTCACAGATGATATAGAGGACATATTAGAGCACATACACGATGCTGAGATATACGATGCTGACAAOOTACCATAA 
CATATATAACTTCTAAATCATCCTAATArACATTATTAATTATTTG 
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>233c_cpl_<uU 500bp in-house: 1-500

GUUATCAiACAAC*ACUatCArAiOXUCTanCTACCAUTCTACm*GCAUm 

ACmATCCTUAaX6TTaiACCACT*CUCAanTCATm<MCACAG^^ 
UCATaA3AU(OTCACCATCCCTACCT«CAA<3TTCUarcA*CACCAACTaUC^ 
AAAnACTTACCAATOrrjrCAATAAiC?TCAAAATAAT<aTCATn ClAaUTTC ATTAATa 
TAAAAUTCCAAATCrGGmGAAAAAArrAmAACAA(UACTA^ 

AATOacrrTTmAnTrrArrniTTmnTTAGTTTAGTTTrcATTrai n i lui u i lun rAATrrccAATA 

CAATATAAT6TTTATTTTTT 



>22g3 (5') 535bp in-house: 1-535

AGGTTCCAGTrACCAAnTAGGAAGTGTGTTGCAAGCAGGGCTACCAAATATO 
GGTGGCAACACATATGGTAGTAAGTGC 

TACCAATGTOGGTGCA.AAAAATnTOCCAAOTAAnTGTATGGCAATAACAGA 
AGTGTTGGCGGATTCNAACTGAGOAAT 

CnTGGTGTGTAAAAA.AAAAGCAATAGCGACTACGCTACAANAGGCAATCNAT 
TATrATTATAAAGTGGA.\GTTATATAT 

AT^^TCTCGGGGGGGGGGGGGG^^TNGGN^^^CCCCCCCCCCCCCCCCAN^^IT^ 
TNTCGGCCCNCCCACCNTNCGGCCTTC 

TGGCTCCCCCCCNNCGGGCCNCNNGTAAATNCCTCCACCCNGGGANAANGGNA 
AANGGGGAACNANNA.AGGGGGGACNNN 

NCACCCNATGGGAGGGAAAATCCCNAAN>fnTNCCCCCCCNNCCCNOCCNAAN 
CCNCNTGGGGNGGGCCAAANNCNGGGG 

GCTNCNCNCCCTNCCCCCCCOCCNTNNCCCNNNNWCCNNCGANCTCT^ 

GC

>22g3 (3') 426bp in-house: 1-426

CCCCCATATAACGTTGTCA.VrAGCAATACTCTGTCGCACCCATAGTGTGCACTr 
CTCGGTGGTATAAAAA.AAA1 J 1 1 1 IC 

TCCCAAAAAAAATCTraCCTrrCCACCACTTITTTCTrCrrCTTCCTrCCCCAT^ 
CCCTCCCAAATCCCTCATTTTCCC 

CAmCCCCTACCCrCCTGGCCCTGTATTCCAAAATnTCTCGGGGNTACGCCC 

CGAAGANAACCTCCCTCCCACCCACC ^ 

CATCT^^GTCNGGNTTCGACCT^CGOCCTCANGGCTCCACCGTCGGGG^^•C^TO 

TATATTTGTAGACTCCNGGAAAAAGG 

GAAAAGGGGAGGAAGA.\GGGGGGAAAAAAAAAAAANGGAGGGNGAATCCTT 

i riNl 1 1 IN CCCCCCNTCTCAAACCNAAA 
CCCCNTNTGGGNGGTCN.\A"rTAGGGG ^ 

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>35gK 1334bp in-house: 146-669 public: 1-145 PathoSeq: 670-1334

ACAACCTATAATCGACAGTrrACTATATCTGCTGACTTCAAAACCAATGCATrC 
TrCAAGCGTGCTCTGTCGATTTCTAT 

CATAACATCCACnrCCGGNGTAATCGGATTACTAAAGCCACAGAATCAAGGT 

GAACATCAAGCrrCAACilCl 1 ICITG _ ^ 

GTCCACGAATAATmAAmO0TrhmSKKGSMAN«GCriTCTACRGTAGOTT 

TGAATCnTCCAACATTGTCnTGCA 

TAOAAACMGCACCAGACAAGAAACATGTCCACTCGACCATCAACYTSKGGGT 

AWWGACAAAGTWAATCTGTCTGGATCCT 

mCATCCAGTTrCCCTGCATKGGAWACAAGTNTGTCCCGCACAGTTAACACT 

Gl 1 1 Tl AnTTSKTGGTATTAGACTCA 

TCAAGTTCCGAAGGAGAGGCATCAnTARGGGWATAGACTCCGCTGAGTTAAT 

ACTGOATAAATCACTTArnrCAGATrC 

ACTGACrrGTWCTTCAGTGACCTTATCAAAATCCTCAATGTACTCSGARGCGTW 
TrCMCTCMATGTGAAGGCnTTAAAA 

GGGCAACRCrGGmCAMATGCnTCTrGCRAGTrrGTACKTGACAGAAAAA 
TCAAAAACYTTGAAAGATATACCTCTT 

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CTAAACJTCTmAAATCAATTTCTTTn-CCTAATrmCATCATATAGCTTATGAC 
TTGGCAAACCCTCCnTACATACCAT 

ATCCATTACAATGCTAGAAATGTCAATCTTCACTOACGATATAAAGOATOGAA 
GAACTTCAAATAATnTATAAACTCAG 

OATTGGCTGGTGTATCTGCTQCAGGAGCTCCAGATnrATTGTCCATrrGCTCAC 

TCCATGGACATACATTATTAACGTCC 

ATCrrmCCATTCTrCAAAmCrrCGGTGAAATAAATrCGTrGACGRWTnTA 

AACAGACGTACAATGTGAAAOATAA 

GATCATTAGCAGAGAGCAATrCOAOACTCTTGCTTOAAAGTrTGATTGACACO 
nTTGTTGTAACATATTGTAGGTGGCT 

AAAAGATTGACrrVWlOTAAAATORAACrrrATrAACCCTGGGCCCTCACATITC 

ACATmrC ATCTTAAACAAAGKGGTT ^ ^ 

CAAAGKGGAACTTGGTrTGGATCCVFTAWTGOAAWATrTCYCAGKRAATACTT 

TCAAAATCAACTCCAGGAGAGCCACAG 

TGATAATTGAATTGGATrTAOATAAGCGGTTAAACTTCCCAATITCAGTnTAC 

CAAACTCTGGTAAATGAAGGTTAAGT 

TrrGTGTCCACCACAACAAGmACTAAAAACAOCCTTGAGCATITTGGAGGCA 

>36g2 (5') 520bp in-house: 1-520

CGTATAGAGAATAATCCGTTOAAATrOATrGTTCAATCATTATrGTATCTnTCC 
CI I I J 1 i 1 I GTTCTAACCATAATGT 

TAGAATAATTAGAAATrGTCTAAATATATATTCAGTTTAACAAAAAACAGAAT 
GCTTGCAATAAGAnTGATTTCTAATT 

ACTAATCGTTAATATTTAGTITGGTGGGGTnTA'nTATCOAAOATGTAGCATT 
ATTTGTATCNAATAGATAAAGAAACT 

TGAATTAAATGGCOTAATTTCTTGCAATAGTAAAAAAGAAGAAAAGTGGTAAG 

GAGTGAGTGAAAATATmTTGCCCCA 

ATirGAGTNGAAATCTTACACCNAAAAGTTTGGACNAAAAAGTmTACTAAA 

ATCTGANAATCTNCCTGAATAGAACCO 

ATCATCCNCATNrCCGArrrCNTGAGGANAGATAGTGGCCCCACCTCNTGGTG 
ATTAGAAGGAGCNCCCATOmTACAA 

TATCTATATCCAGAATAACNTGTITGTGACCTCNCCCCNG ^ ^ 
>36g2 (3') 472bp in-house: 1-472

CTCTATATATAGTGAAATATAACATCAAATAATGTACAAAAAAGTATAATAAA 

TTGATTTAGAAATGAGAAAAAGAAAAA 

AACrTGAAGTAGTGAAGATATATTTOTTGGCTATCTTTCTTGGTATGGCTCAAT 

TCAGCCAATCTrGGATGA.\AGGTrGG 

AGTTirAGmCGTGGmATTGATTrGTAAGTACnTCGGGCTAGAAAGTTNA 

CAAACATGATTAATCTTGATATANAT 

ATrrGTrAAACAmGGTGCTCC>n-CTTAATCNCCCAAAAAGTITGCGNCACTA 

TCTTTCCNCCNGAAATCTGTATATGT . 
TGANTGANCCG^^•CCATTCCTGTTNA^mTCNGA^^T^AOTTAAAACCTTTTTG 

TCCCAACCTnTGGGGTTAGANTrCN 

NCCCCAKTGTTGCCNNA.^ATATmCNCNCCNCCCTNCCCCmCCCC^r^TTAC 

NAATGCACCAAGTAAGCG ^ 



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>38gl 1348bp in-house: 183-940 PathoSeq: 1-182 / 941-1348

TCTCTGGTATAACTTGCACTACCTCATCGCTACCCCGGAI 1 1 l i 1 1 TIGGTATGA 
TCTACACGTCCTCATCGCTACCCCA 

GA l 1 11 1 1 1 IC TGGTGCGCCGGACACGCCCTCCGGTCCGCACCGAAAACCGGGG 
TAATCTCCGTCGGAGATACACATCCG 

CGGACACAAAATCAGATGAGCTACCACCGAAAATTCCGAAAnTCAAAAACTC 

AAAATCCCTAAAAACAAACTATCCAGA 

NATTATTGCCATGCCCTGAOOATGAGmAGTrnTTAAri 1 1 iGAAAAATOTC 

CAAAACTGGTTGTGCTGTATAGGANG 

GGTAAGAATrTGCCATTCTGCCCCTrTGGQTGQGTCAOTCNAAAAAAGANGTA 
TCACTCTGGTTCNAACGGGAAACAACN 

NAAAATGGGATTAAANrrWATCTCCAGAMCAAACTTAQCTTMWWACACCCAY 

TTTAGTTGTACTSGYGWRCCMAAMMCMAA ^ 

TTnCCATnTGTTTGCGGANGGGAATn'ARACCAAA^V in i n 1 1 1 iGAAATTT 

CGCTMAGTGTYMAOAMCCSCAAAAO 

TCACCTirmCGTrrrCN^ICYACGGCAIUROCYCACCGGTTTrKYKTGGKGS 

MCRGCCMAATTGAWnTGTGGGTGSGC 

ACGKGGAAAAACAGTIXGTTAGTGGACACOTTnTGCAGTGTGAAACTGCGCT 
CGGAGGTACTATATGCGAAAGCAGAAA 

AGACAATTGCAAGAATACAGAGAGTTCTTCTCTGGGCTANNGCAATGTGTITA 
AGGCCAAGTCGACGAGTGGGGAGAGTC 

TGGAAGTGATATACACATCACGACCTACnTATACGCTACGTTCGGCATGGGC 

GAGCCACTGTACGGTGGC AAGCCTGAA . „ . ^ 

CAGTCCCACACCAGATATCTAACGATTCTGTGTATGGGCACTGATGGGATTTAG 

TGGATTACTAGCTGATAGCAAGTATT 

GAAAACrAAAACCCGACTCGGGGGTATGCCTTGGCAAGTAGCCGGAGTAAAAT 
CTGTGACTITGCTGAGTGTAACTCCCT 

CCATGGTTGGCGATGTTCGACGTGCGCGGCAGTTCTTGTCGTATCACAGTCGCA 

CGGACACCACACCGGGAGAATCTTAA ^^^^^^ 
GAGGGCTATATGGATGTGGAACGGTTTGCTTGCTGTGGTAAAACACTGGCGGG 

CGAGCCGACGTTCCACGGACACAGCAA 

TGTGTTTGCAACCAAATAAATAACTTGTACGGnTGAACGTGl 1 1 1 iGGCTGCT 
CCTTCCAGTTCTTGGCOGGAGAAGCT 

TGGGCGCGGGAAGACCACTACTACGTAGTrATCTGOTrOATCCTGCCAGTAGT 
CATATGCTTGTCTCA 

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>60gK 990bp in-house: 445-752 public: 1-140/753-990 PathoSeq: 141-444

ATTACCGATCCGTCGGArnTAAAACCACAAAATTGCCTOCATTAGCAGAGCT 
AGATATnrCATAGGGTGCTATATATG 

CAAAGATCTATTGAATG CACCC GTGAOGACACAATGTGATCACACGTACTGTT 

CACAATGTATACGAGAATmTACrrC 

GAOATAATAGATGTCCGCTTTtrfAAAACAGAGGTnTTOAAAGTGOTCTAAAA 

COTGATCCATTGTTAGAAGAGATCGTC 

ATTAGTTATGCCTCCCTTAGGCCTCATTGATTACGATTATTGGAGATrGAAAAG 
GTGGAATCGAAGCAAGAGGTAGATCG 




TGAGAAATCAGCCAATGAGTCAGCGCTGAATGGTAATAGAAATGTAAACAAC 
GATGTTGACGAAACTGTGCGCGTTAAAO 

ATCAACTGAATGCAGATAAACTAGGTGAAGAAAAAOGGCAAOCTCAACATGG 
GGAACAAGTNAAACGAGCAGACTACTGA 

AGTTATTCTGTTGCTATCTGATGATGAAGAGAATGGTTCTGATAOCCTAGTAAA 
ATGTCCT Amai 1 1 l OAGAGAATGG 

AATTAGATGTACTACAGGGAAA GCNT ATTGACGACTGTCTAAGTGGAAAGAGC 

ACGAAGAGOACOCCTACAGACATnTA 

TCCCCAAAAGCCCAACGACCGAAGCAAATCACCTCCTrnrCCAACCAACAAT 

AGATACCANAACNCCTTCCCCCACCTA 

CCAGTrNNGGCGTCNACAACTCCCACAGCAACTCCGACAACTACATTGTTOAA 
AGCAAACGTCTCATCTCCATCCCAAGT 

GGCGCAAAGTACAGTAAACAAGGGCAAGCCATTACCTAAACTCGATATCAGCA 
GCTTGAGTACTAAAA.AAATAAAAGCCA 

AGrrGAGTGATATGAAACTACCAACAACAGGTAGTAGGAATGAAATGGAAGC 

CAGATACTAGCATrACTATGTGATTTAT 
AATGCCAACCTTGACACCAATCATCCTGTA 
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>64gB 627bp in-house: 1-627

TNCANCCmCCATNCNCCCACKKJNNNOCCACCCCNCjCCGNNCC^^ 
CCCCCCTCCTTNGTNGCCCTCNNGGTG 

GTOmOTGOTOTOACNAATAAANATOOTNTATCATTAOAANAGOACATrGCN 

NCGGAAATGACTGTCGACAATAAAGAA 

OCAAATATATACAATGGATTATGAANGTGCTAGGATGGATTTGAAAGnTATC 

TGGGTTTATrCCAATGTAAAAATTATr 

TOTAATTGATATOGCTAATTATnTGCTCNATATNTATCACAAAAAAATQATrA 
AGTTCGAAATGAAATTGGCNTCCATA 

TATAAAATTrCTGACAGGAAGAGAAAATTCANGACNTGTrGCCCNAAAAAAAA 
AACnTACCCCNCNTCNANTCNTGTNN 

GACTTAACCCCCAAAAANAANANNGCTGOCGGCGGNAAAAAAATAGGAGGGG 
GCCGGWGTTrnTAA,\ATITNANNCTT 

GAATATGAACCCAANNTrrGNNTTCNTrrrmCCACNCCCCCTTCAAATTTNAT 
TCCATOTTCCCAAGANNAGGGNGGNG 

GGGGNGOrrCChWCTTTTAAACCNCCCCCCCCGGGTGGNGGGGNCCGThriTNT 
TTCCGGNGCkiGOfT 



>8c_cp 890bp in-house: 287-890 public: 1-124/154-286 PathoSeq: 125-153 

ATGCAATrCTCATCCGGTGTCGTOTATCCGCTGTTGCTGGGTCCGCnTGGCTG 
CTTACTCCAACTCCACTGTrACTGG 

CATTCAAACCACTGTGTCACCATCACTTCATGTGAAGAAAACAAATOTCACOO 

AAACTGGAAGGTTACC ACTGGTGTTAC . , » ^ 

CACCCTCACT0AAGTTGACACTAC0TACACCACCTACTGCCCA1TGTCAACCAC 

TGAAGCTCCAGCTCCATCTACTGCTA 

CTOATGTTTCTACCACCGTrGTCACCATCACCTCATGTGAAGAAGACAAATGTC 
ATGAAACCGCTGTCACCACCGGTGTC 
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ACCACTGTCACTGAAGGTACTACCATCTACACTACCTACTGCCCATTGCCATCT 
ACTGAAGCTCCAGGTCCAGCTCCATC 

TACTGCTGAAGAATCTAAACCAGCTGAATCTTCCCCAGTrCCAACCACCGCTGC 
TGAATCTTCCCCAGCTAAAACTACTG 

CTGCTOAATCTTCCCCAGCTCAAGAAACCACTCCAAAGACCGTTGCTGCTGAAT 
CTTCTTCAGCTGAAACTACTGCTCCA 

GCTGTCrCTACCGCTGAA GCCGG TGCTGCTGCTAACGCTGTCCCAGTTGCTGCT 

G Oi l lO l IG GCnTGGCTGCi'nGlT 

TTAAGTirATTAGAGCTTAAATCAAATATTTACAAACAAAATTTTCATnTCCC 

CCCTTTCC Cl 1 ICl I CATTCTTCAAA 

AAAGGGTrAmACTATTAATTGATAAATrTATGGTTrCATGTTAATrTACCCTr 

TrCnTATAAACAITGGTATTATTA , 

TrATCATCATTAGNTrrAmATATmCGTOAGTTTrrCGGhnTTAATrAATrr^ 

TTTGOATACATATTAAAAATTTAT 



>8S33 451bp in-house: 1-451

aAATATiCOTCCAGTTCTGCG^TTCiAAJUACQCCTi l i 1 1 IT CCACCACCAgJUUAAAACTCCATTTGCCCCTCCA 

a<nc(XT3AGATAncccAarnTrA>GAccvx:AT G TTi r TO GCTAGCQCTGccnucAcaa Tnri 1 1 ! m m 

JAAaCCCCACAAAACUGAGCACACATACAAAATCAAGACaKACAAAACACAGCACACATn'AACAG aCAT TTTrOST 
Aa:ACACACTmAAaGCACACAAAAAAGA(XACmTTrCTAAGACO;CATCmCCTAQCACAaCTmAAiGA0CA 



TTGGTACTAO 




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>66g4 579bp in-house: 1-579

CCCCGTTAACCACITCTAGOGTATACCATTTCATCTGACTOAATAACTGGTTAG 

TCGAnTGITOTTGAAGAAAAGTOAC ^^.,„ 

CACCTAGTrnTTCTGCCAACATlTTTrGCGATGAGCCGTCGACGCGTrGTCnT 

TTCTACCCCACGnTAACAATCTTG ^ 
CCACTCAATTCCCTAGCCAAATAAACTITAGACTCACAACTCTAACACTGACTC 

OTGCCCCCCTGnTAAACTCTAAATT ,„.,„^^.«„ 
ACTTCACAGAGCCmACTACCTTAAATITARGRTr^ 

TTGCAAATCACCCTGACTYGTmr .™^.„,rr 
TTTTCAGCCAGGTrmCGTTAAAATCTGACCAAAAAATITACRACTCCrATVVT 

TTAAAACTCYAAAWWACAATTAAAAC ^ 
TCAATTCAGACAAGTCCTTCTGCTCATTCTOAGTCTrCTCTATrGTCTnTGACT 

TnTGTOTGTOACTATTTTCATGAT . * * . * 

CACCCCGTTrCTTGCATITTmCAGTCAACTnTrCTCAAAATCAAGCCAAAAA 

AACACACCTTTAACTACCTATACAA 
CGCAAACCTATTCAAAACA 



>NDI (17c.cp) 807bp in-house: 1-614 PathoSeq: 615-807 

AACCTATrCCATAATGTrTACTAGATCATTGATTAAAGGTGGTGGCAGACITGC 

TACTACCAGATCATTGGTCAACAACT .^.^^^^^^o 
CTACTAGTirGGTrTTAAAAAATCAATITAAGAAATATrCAACATCAACTCCTC 

CTAAGGTTGCCAAATCAAAATCTrCG 

ACAATrGGTAAAATATOAGATACACTmTACACTGCTGTGATATCGGTTATr 

GGTTCTGCCGGTITGATCGGTTAC AA « . . 

AATTrACGAAGAGTCTCAACCTGTTGATCAAGTGAAACAAACACCATrGTTTCC 

TAATGGTGAAAAAAAGAAAACTTTAG .™^^a^*/./>a 
TTATmGGGTTCTGGTTGGGGTGCTATTTCATTATTOAAAAACTTGGATACCA 

CCTTOTATAATGTTGhrTATTGTCTCC * ™ 

CCAAGAAACTATrrCCTTTTCACCCCATrGTTACCATCTGTTCCTACCGGTACTG 

TTGAATrGAGATCTATTATTGAACC . * . ^ * 

TGTCAGATCAGTCACCAGAAGATGCCCTGGCAAGTTAnTACCTrGAAGCAGA 

AGCTACAAATATNAACCCCTAAAACTA ™ .^^^^^^a a a * a 

ATGAGTTGACACTTAACAAAGTACTACTGTCCOTTCTGGTCATTCTGGTAAAAA 

TAanrCCTCTTCTAAATCAACTGTTG .^.^^a^a-t-t^a 

CCGAATACACTGGGGTTGAAGAAATCACTACCACCTrGAATTATGACTATTrA 

GTTGTrGGTGTTGGTGCTCAAACAATN 

CTANTnTCGGNAATCaGGGAGNCGOfTGAGGAANTrCAACCCci i 1 1 1 iGAA 

AGAANGNCCAGTGGANGCCNTCTGCN 

AA1TAGA 
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>HOL1 (409cS) part2 762bp PathoSeq: 1-762

OATCAGAATAATGAGOACTTTATACCTOGAACACTCAATATCTATTCCTTGOAA 
OTrGACTCTOAAGATGAAAACGTGAG 

TCATTACOATGCTTCCAGTCGACCAAAAGTGAAAACAAAAOGCAATATAATCC 

TCTTCCCACAACCATCGAATTCATGCA .^^^^ 

ATGATCCATTAAATTGOAGTAAATOGAGAAAGCTAAGTAACnTrnTATrGTCA 

TmrATrACTGCmTACAGCAGCt 

ACTTCAAATGACGCTGGATCAATTCAAGATrCACTTAATGAAAAATATGGAAT 

TAGTTACGACGCAATGAATACAGGGGC ^^^^^^ 

AGGCGrnTATnTTGGGTATrGGATGGOGTACl'lICri 1 1 1 AACACCTOCTTCO 
TCGTTATATGGTCGAAAAATAACAT 

ACmATATGTATCTTrCTTGGTTTATrAGOCGCTOTTrGGTTTGCCTTGGTTAA 

AAGCACTTCCGACTCAATTTGGTCG _ 
CAATrGnTOTTGGTATTAGTGAOAGTrGTGCTGAAGCTCAAGTACAATrAAGT 

TTATCAGAACnTATnTGCCCATAA 

CCTrGGTrCTGTGCTTACGTCCTATATrcTTGCAACnTCCGTAGOTACTTACTrA 

GGACCnTAATTGCAGCCTTTATTG ^.^^^^^^ 
TTCAAAACATTGGTTITAGATGGGTrGGrrGGArrGCAGCAATTATTAGTGGTG 

CATTATIGTrCGTAATrGM I i I IGT 

•rTAGATQJJkAACCTATnTOATCGAOCAAAOnTACCAAGCCA 






>GAL2 (360c6) 1004bp in-house: 625-1004 PathoSeq: 1-624



TCCATmCCCrmCCTCnTTTrCTACATCATCCTCACANCAA'nTCAAATATG 

TCTCAAGACAACGTCTCATCAACAT ^ ^ ^ ^ 

CTACAGCTGAGGCTGTAAATAATGAAATCAAAGTCAAAGATGAATTTCCACAA 

GAAGAACAAGCTCATACTAGTTTAGAA 

GATAAACCAGTGAGTGCATACATTGGTATCATCATTATGTGnTCCTTATrGCC 

mGGTGGTTTTGTTTTCGGTTTCGA .^.^^^^^ 

TACTOGTACCATTTCTGGTnT ATTA ATATGTCTGAL 1 1 1 1 1 AGAAAGATTCGGT 

OGTACTAAAGCTOACGOTACTCnr ^^^^ 

ACTTirCCAATGTCAGAACTGGTrTAATGATTGGTITGTTCAACGCTGGTTGTG 

CCATrGGTGMWrrATYdTGTCYAAA 

GTCGGTGATATOTATGGTAGAAGAGTTGGTATCATGACTGCTATGATTGYCTAT 
ATTGTTGGTATTATTGTTCAAATTGC 

TTCTCAACATGCTTGGTATCAAGTCATGATrGGTAGAATTATYACTGGTCTTGC 
CGTYGGTATGTTATCAGTTTTATGTC 
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CmGTTCATTrCCOAG<TmCTCCAAAACATITOAGAG<n"ACTTTGGTGTGCTG 
TTTCCAATTOATOATTACCTTOGGT 

ATCTTCOTGGGNTATTGGCTACCTATGGTACTAAGAGTTACTCAGACTCTAGAC 
AATOGAGAATTCCATTAGGTTTATOT 

TTCGCCTGGGCnTATGnTGGTTGCTGGTATGGTTAGAATGCCAGAATCTCCA 

CGTTACCTTGTCGOTAAAGACAGAAT 

TGAAGATGCTAAAATGTCACTTGCCAAAACTAACAAGQnTCTCCAGAGGACC 
CAOCATTATACCGTGAACTTCAATTAA 

TCCAAGCTOGrGTTGAAAGAGA AAGAI TGOCCGOTAAAOCATCTTGGOGTACr 
TTATrCAATOGTAAACCAAGAATCTTT 

GAAAGAGTTATrGTrGGTOTCATGTTACAAGCCTTACAACAATT ^ -71, A 



>KGD2 (98c_cp) 334bp in-house: 139-334 public: 1-138

TTCTAACAACAACATCmCTTGGATCTTCAATCAATrCCTTGATGGTrCTTAAG 
AAAATAACAGCTTCACGACCGTCAA 

CTACTCTGTGGTCGTA^GTCAATGCTAAGTACATCATrGOTCTAGAAACGATrr 

GTCCGTTAACAGNAATrOGTCnTNT 

TrAAAANTGTGTAAACCAAATACOGNAGTTTAANGCATnTrATAATTGGGGT 

ACAGTATAATGATCCA.\TAACACNGNC 

ATTANAAATAOTGAAAGAACCNCCGGTCATATCTTACAAAGTCAATITACNAT 

TTCT GGCrn NITACNCAAAITANANA 

TrrCCTTTTNAATA 



>RNR1 (38) 2562bp in-house: 1-2562 

ATCJTATGnTATAAGAGAGATGGCCGTAAAOAGCCAGTACOnTCOACAAAAT 
CACTGCCAGAGTTCAAAGATTATGTTA 

CGGnTGAATCCAAACCACGTTGAACCAGTTGCTATTACCCAAAAAGTrATATC 
AGGTGnTACCAGGGGGTTACTACTA 

TTGAGTrGGACAACrrGGCTGCAGAAATTGCTGCTACAATGACAACAATTCAC 
CCAGATTACGCTGTCTTAGCCGCTAGA 

ATTGCCGTATCAAATTTACATAAGCAAACCACCAAACAGTATTCCAAAGTGTC 
TAAGGATTTATATGAATACATTAATCC 

TAAGACTGGGTTACACTCTCCTATGAnTCCAAGGAAACCTACGACATCATrAT 
GGAACACGAAGATGA.ATTAAACTCAG 








EP 0 982 401 A2 

CCATTOTrrACGACAGAGATTTTAACTACAATTATTTTGCKnTCAAGACnTGG 
AAAGATCATATTTGTTACGTATCAAC 

GGTAAGGTTQCTGAAAGACCACAACATTrGATCATGAGGOTTGCTOTCOGTAT 

TCACGOTAATGATATACCAAGGGTCAT 

TGAAACCTATAACTTGATGTCTCAAAGATTCTTCACCCATGGTTCTCCTTGTITA 
TTTAACGCTOOTACACCAAGACCAC 

AAATGTCCTCATGTTrcrrGC TrGCT ATGAAGGATGATrCTATrGAAGGTATTT 
ACGACACnTGAAATCGTGTGCTTTO 

ATCTCAAAAAGTGCTGGAGGAATCGGnTACACATCCACAACAITCOTrCTACC 
GGTGCrTACATTGCTGGTACCAATOG 

TACTTCTAATGGTATTATTCCAATGGTAAGAGTATTCAATAACACTGCACGTrA 

TGTCGACCAAGGTGGTAACAAGAGAC 

CTGGTGCCTrrGCCTTGTACTrAGAACCATGGCACAGTGACATrnTGATTTCA 

TTGATATTAGAAAGAATCACGGTAAA 

GAAGAAATCAGAGCCAGAGATTTGTTCCCAGCnTGTGGATTCCAGATTTGTTC 
ATGAAAAGAGTTGAACAAAATGGTGA 

CTGGACrrrATTCrCACCAAATGAGGCCCCAGGCTTGGCTGATGTTTATGGTGA 
CGAATTCGAAGAATTATACACCAAAT 

ACOAAAAAGAAAACCGTGGTAGACAGACCATCAAAGCTCAAAAATTGTGGTA 
TGCTAnTTGGGAGCCCAAACTGAAACA 

GGTACCCCATITATGTTATATAAAGATTCATGTAACAACAAATCCAACCAAAA 
GAACTTGGGTATTATCAAATCTTCCAA 

CTTGrGTrGTOAAATrGTTGAATATTCTGCTCCAGATGAAGTTGCTGTrTGTAA 
CTTGGCTTCCATTGCCrrGCCATCAT 

TTGTTGAAAATGATGAAAAAAGTACTrGGTACAACnTOACAAATTACATCAG 
GTCACTAAGGTTGTCACCCGTAACTTG 

AACAGAGTTATrGACCGTAACCATTACCCAGTCCCAGAAGCTGAAAGATCAAA 
CATGAGACACAGACCAATTGCnTGGG 

TGTrCAAGGTTTGGCTGATGCCrrTrATGGAATTGAGATTACCATTTGACTCTCA 
AGAAGCTAGAGAATTOAACATTCAAA 

TlTITGAGACTATCTACCATGCTGCTGTrGAAGCTrCAATTGAATTGGCTAAAG 
AAGAAGGTGCCTACGAAACCTATCCA 

GGTTCTCCAGCCTCTCAAGGTrTATTACAATTTGATTTGTGGAACAGAAAACCA 
ACTGAATTATGGGATTGGGATACATT 

AAAACAAGATTTGGCCAA.ACATGGTATGAGAAACTCCTTGTTGGTTGCACCAA 
TGCCTACTGCTTCCACATCACAAATTT 

TGGGTAACAATGAATGrnTGAACCATACACTTCTAACATTTACTCTAGAAGAG 
TATTAGCTGGAGAATTCCAAATTGTC 

AATCCATAnTATTGAAGGACTrGGTTGAnTGGGTGTCTGGAACGACGCTATG 
AAAAGTAGTATTATTGCTAACAATGG 

1TCTATCCAAGCCTrACCA.AACATCCCTGATGAAATCAAGGCATrGTACAAAA 
CTGTCTGGGAAATCTCACA.AAAACATA 

TTATCGACATGGCTGCTGATAGAGCAGCATTTATrGATCAATCTCAATCATTAA 

ACATTCACATCAAAGATCCAACAATG .^.o^^- 
GGTAAATTAACCAGTATGCACTrCTACGGTTGGAAGAAAGGTITAAAGACTGG 

TATGTACTACTTAAGAACACAAGCTGC 

CAGTGCTGCTATTCAATTTACCATTGATCAAAAGATrGCTGAGACTGCCGGTCA 
TACGOTTGCAAACTTGGACAAATTAA 



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ACATTAAGAAATATGTTAACAAAGGAAGACnTGAGAGTGAGAATACCAGTGAT 

GCTCCATAC AAGTCACCATCAACCGAA ^ . . , ^ 

CCAACCTCATTAGAAAGTrCAOTrGCTGAnTGAAAATAAAAGATGAAGGTOA 

AAAGCCAGCTOAAGACAAAACCATTGA 

AGAACTCGAAAATGACATTrATAGTGCCAAAQTrATCGCATGTGCTATrGATA 

ATCCAGAATCTTOTACAATGTGTTCTG 

GT 



>SAM2 (36) 1155bp in-house: 1-1155



ATGACTACrrCCAAGGAAACTrrCCTTrrCACTrCAGAATCCGTTGGTGAAGGT 
CACCCAGATAAGATTTGTGACCAAGT 

CTCCGATGCCATTITAGATGCTrGTITAGCTGTTGATCCATrGTCAAAAGTrGCr 
TGTGAAACTGCTGCCAAAACCGGTA 

TGATTATGGTrnTGGTGAAATTACCACTAAAGCTCAATTGOATrATCAAAAAA 
TCATTAGAGACACCATTAAACACATT . 

GGrrACGACGATTCTGAAAAAGGTnTGATTACAAGACTTGTAACGTCTTGGTr 

GCAATTGAACAACAATCTCCAGATAT . 
TGCTCAAGGTTTACATTACGAAAAAGCTrTGGAAGAGTTGGGTGCTGGTGATC 

AAGGTATTATGnTGCTTATGCCACCG 

ATGAAACCGATGAAAAATTGCCATTGACCATnTATrGGCCCACAAATTGAAT 
GCTGCCrrGGCTTCTGCCAGAAGATCA 

GGTTCCTTGCCATGGTTGAGACCAGATACCAAAACCCAAGTCACCATCGAGTA 
TGAAAAAGATGGTGGTGCAGTTATCCC 

AAAAAGAGTCGACACAArrGTTATrrCCACTCAACATGCCGAAGAAATCACCA 

CCGAAAAnTGAGAAAAGAAATTATTG . ^ , 

AACATATCATCAAGCAAGTCATCCCAGAACATTTATrAGACGACAAAACTATC 

TACC ACATTCAGCCATCAGGCAGATTC ^ . , ™^ 

GTCATTGGTGGTCCCCAAGGTGATGCTGOTTTGACTGGTAGAAAGATCATTGTT 

GACACCTATGGTGGTTGGGGTGCACA ™..^«^«^ 

TGarGGrGGTGCCTrCTCAGGCAAGGAnTCTCCAAAGTrGATAGGTCTGCTGC 

TTATGCCGCTCGGTGGGTTGCTAAGT 

CGTTGGTGACCGCCGGATTGGCCAAAAGGGCCTTGGTGCAGTTCTCCTATGCTA 

TTGGGGTTGCTGAACCCACCAGCATT . ^ . * xo- 

TATATAGACACCTATGGGACATCTAAATTGAGCACCGAAGCCCTTGTAGAAAT 

TATC AAGAATAAmrOACTTACGCCC 

TGGCGTAATTGTAAAAGAATTAGATITGGCTCGTCCTATTTArnTAAAACCGC 

TTCTTACGGACATriTACTAACCAAG 
AAAATTCTTGGGAACAACCAAAAAAATrAAAAnT 









CCTG24TilfeTCTT2uACC(;TAGATU(XUlUmATOT^ 
GTAaACGAQGAGTCrrTTTCACUAUaACTCCTAUmATGAATCTAGTTT^ 

AATUrCCA«UAAACTCAATrrTCCTUTTQXU CTTCTCOCAgrCC TTA^^ 

CTSuaWACTACCTGCCATCATATtaCTAUT^^ 
TTAAmAAAaaCATATTCTCCCAiTAATACTTTTTTCmCAACTTAT^^ 
iTTCCUTTTSTACTCTCTTA<nTaXaCArcAGT(^^ 
CTGACUAACTOUCACaACCCACTCUTACKAmTACATATTTICAT^^ 

TtXrcaAATCrrcAOGnaAATAnACTTTAACCATCAATGAACAACTACOGCAAAC 



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XX t9 

1 HSrVW^TPKS aSTXXRAPAF CrELEFSiQG SSOGAISKAA L^V^/FS^/DN 
XXX R 
31 QSrVLIKOLA :<Y%'Gr?SSYQ Lr^KWHCAN IBXSQIUCTO KWLSKttrE- 
101 DLISEA0TK3 ClFYIStPLV YSRISWOPTr WlRfiPrQWC VSXAMCEKP 
ISl ASWAASSDD r>JLOMESDE MGELSXGVX HNKKSMPKYI 

201 K3DRVTIGQV TAsiYOlOPST FLTHSlFNil NSMSXINYVK NFCVSOVRTL 
251 PN-SKLSVASR 21V1MANN»: DMHZr,'BKTE3 KPKXSFRXPr OKSKXMNLQZ 

fa 7 

3C1 DFNSIOLSSS •/:?flCGFI?I> FSIKM^CKV? NVr/TSHHQS LPLS?yrKKL 

X 

3 51 NATSWSSYtF ;C:n,t<r!tSKS igKLVTMSOT rNYW-rKYTi* WrYRGPOSO 
4C1 IJTKOQKliCJX MKiKLSSKK K?FKJ«»?.'SN NKH-fNKSLKO LVHESTDKM? 
«S1 VSYIXS2QRK yTZSYSNLE; LHHStSFKVu L:rryaCV*QB TWHNYVXrXi 

X «■ 

s m 

501 IDFEQLKALQ HSA-VS-ESSK wDAA5HQgWi^ ESEK^P.QERL SLWBDBPilE 

XX !^ X * X XX 
» « « • • • »» 

551 FECLQSSrSQ PJOXIEEKLR filCLEAStSD SFEADSENDS BSELAQIQQD 

eol rssslN-AtKr"K?£Ar^^^^^ mpa?f?c?:e tpqidinnk? 51pt\wei: 

missing sequence

missing
7:1 ATISGVJXO 
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ISSl 



£« 3 

* ■ 

I ^rrSQQTQVX MPSG«GCHG HYQCCOOYHA VO?>?PQOGY 

Al&biQUiti0d 
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51 ymqpvGWj YyC'?:cc*M? r^'A'aQQPB^G ctcsclxscl aal-cvcctld 

as 

101 KLF 
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£ t t 

1 MRRREIERRH KJtXRSQRQK EHSAI310ZRI QQLSEQDSKS WQTKXSEXVF 
51 KSARSTOSOA DSTCLKSDKB FDDSAYSPOV LFaWLWNXP KHSWKKCTK 
101 KYTESV/CWL. 05?x:OtSAY NSSTHDBTNI CKEIQI^SNO BWPQIKOTS 

K D VR ffi C 

151 SVMKrriPA-; ?*r;2SlSTSS NKRR-XFETAO VOVXOLDSPX XAQTRNIVKI 

P 

s 

201 aVSCNP>ftr/Y r?>MC3C?.LE? PEGKUrCROQ 
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1 rTDFSO?KTT KlPiwXEuSI LKRCYrCKCZ, LN^VyTQCO HnCSQCIRB 

» 
m 

51 FI^LKKWCPL CKir/rBSGL KHDFU^EZn' ISVASWPHL LRLLSISKVK 
ICl SKCSVr^EW ANESALt^SNlR N'/IWDVDBT/ RVXOOCMXDK tCEEKGCAOK 

3 ffi X 
151 ";/2Q';»l£aTC2 vrtLLSDOSS KGSDSbyKC? 2Cr23W*:.DV LCCKKIOOCL 

m 

20X SCKSTKRTPT Cr:»5?KAK^F XCITSmC?? IDTXTP5PPT SKASTTFTAt 
SON Z K M 

mm a s s « 

251 PCTTLLSyCtV AS?Sr/AQ$T '/HKOXFLPKL OrSSOSTQKI KAKLSDCJCL? 



201 TCGSBIIJMEA PVtA-ir/IVK A2n4)SKH3\' 



f'3 



1 vQ?sSAw*VLS A'/AGSAlAAy SNSrVTOlOT r.*/ri?SCS2 NKCHSTr.TT 

51 Gv^r/TS\X'C r/rrfCPLST 7sa?aps?a? 5vsTr/^/?iT scesdkchs? 

101 AVTOWTWr EC-?rr\TTYC Plf3?EA?GP APSTA2ESK? ASSSrVPTTA 

151 AESSPACTTA AESSPAOETT PKTVAAES33 ASTTAPAVST AEA^5AAAMAV 

201 PVAAGL'^LA kl? 
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1 ?PX^/AJCS:<33 rrOKIFRYTF YTAVISVIGS AGLIGYKIYE BSQPVCQVKQ 

X 
s 

51 TPLr?rCGSKK KTlVICGSGW GAISLIKKLO TTlWA'IVS PWIYFLFTPL 

fs X fa 

3 m m m 

101 :,?SVPTCCV£ L?.5IIE?VRS VTRRCPOQVr XLEASXT^VH PKTNEtW.Q 

151 S'TTr/SGHS^ sXSSSKSTJ AEYTOVESIT TTLMVCYLW CVGAaTILXF 
XXX XX XX X 

m m m mm ss « 

201 GNKRaMRK? .V??F2X3:?SG SKWIR 



AssaSLaaiil 



1 DQKNEOFXPG Ti:;iV3L2VD SBDESr/SHYD ASSRPXVKTK GtiZlLF^QPS 

51 NSCNDPLTWS JC.-S.<LS^I?ri VIFITAFTAA TSNPAOSXQD SLUElCYaiSY 

lei DAlttJTGAm nOIGKGTFF LTPASSLYOR KITYFICZFL OLLGAVWFAL 

151 '/KSrSD5IW3 ;Lr;«ISESC AEAQVQmSIS ELYFAHNLGS VLTSYIVATS 

201 ''G7Yli3?LlA A?:v;n1G?;% WVOWIAAIIS CAuLFVIVFC ldetyfcrak 

251 FTKP 



3S- 
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1 D?r;ssTS7AE ;;*.T;:siK-vro s^pqesoaht SLsoKr/sAV IvIIxmcjli 
51 A?ccrvFG?3 rsTisGriic^ s2?:.SRrGGT kaoqtlyfsw '-'^tclmxol? 

K X X 
mm s 

ICl KAGCMOAL? :.Sr#'3:;MVGK R^'GI5rrA*ltV TITCII'/^IA SCHAWQVWI 

X 
m 

251 GRI:T<2LA'A; Mir^^CFLfr SEVSPKKLRO TLWfQtMl TLOZFIOYCT 

201 r/GrxsYSDS sqs?ii?u;lc favalclvag ef*n«PSSrRY lvcxorisda 

251 KHSLAXiMr/ SF2v?AlYR2 LCLIQAG'^SP. 2RLAGK^S>X? rLrJGKTKI? 
3Ci BRVTCS'.'M-Q Ai;:?SWG!W LFP5YLTSXP K j 



i SArvSGTITS FLVDVDArv-S VC2E:rKMfiE GDAPAGCASA SEA?AKXEEA 
51 PSKAX28SA? AA^.5K>:S£?K KZBFK:<23KP APKK2ESKXS TSSXTSAPT? 

101 T2*TSP-S££RV XJC:?::?.LaIA ER»S5QNTA A3LTTcKF/D kshiadfp^ 

missing sequence 
:51 v:<DePISXTG :Xt-,F«C.AFS XASAIALJISI rA-^NAAIS^-JI Dr^VFKDYAO 

201 ISIAVAT?XG :./T?V,-a:«2 SiSILOIEtlE ISKIOXXMS OXLTLBOHTG 

s X XX 2 c X :f X F xr x ix 

- s ■ 3 m mm m m 9^ m mm 

251 C7FTISNGGV fC-Sl/rT?:: liXFOTA-TuL K0\'X2a?'*'T*-' MGCIVSll?>04 



301 T >ALTYOHRV V:^"?2,nV:FL RTZXZLICP RXyU 
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(54) Drug targets in Candida albicans 



(57) Nucleic acid molecules encoding
polypeptides that are critical for survival and growth
of the yeast Candida albicans are disclosed. Also
provided are methods of identifying compounds which
selectively modulate expression or activity of such
polypeptides comprising the steps of (a) contacting a
compound to be tested with one or more Candida
albicans cells having a mutation in a nucleic acid
molecule according to the invention which mutation
results in overexpression or underexpression of said
polypeptides in addition to contacting one or more wild
type Candida albicans cells with said compound, and (b)
monitoring the growth and/or activity of said mutated
cell compared to said wild type; wherein differential
growth or activity of said one or more mutated Candida
cells is indicative of selective action of said
compound on a polypeptide or another polypeptide in the
same or a parallel pathway.
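The comparison in step (b) can be illustrated with a minimal sketch. It assumes, purely for illustration, that growth is recorded as a single number (for example an optical density) for compound-treated and untreated cultures of both the mutated and the wild-type strain. The untreated controls, the function names, and the two-fold threshold are hypothetical and are not taken from the document.

```python
def selectivity_ratio(mutant, wild_type):
    """Relative growth of the mutated strain divided by that of the wild type.

    Each argument is a dict with 'treated' and 'untreated' growth readings
    (an assumed data layout). A ratio near 1 means the compound affects
    both strains alike.
    """
    mutant_relative = mutant["treated"] / mutant["untreated"]
    wild_type_relative = wild_type["treated"] / wild_type["untreated"]
    return mutant_relative / wild_type_relative

def is_differential(mutant, wild_type, fold=2.0):
    """Flag a compound when the strains differ by at least `fold` in either direction.

    The default fold of 2.0 is an arbitrary illustrative threshold.
    """
    ratio = selectivity_ratio(mutant, wild_type)
    return ratio >= fold or ratio <= 1.0 / fold

# Illustrative readings only, not data from the document.
mutant = {"treated": 0.2, "untreated": 0.8}
wild_type = {"treated": 0.72, "untreated": 0.8}
print(selectivity_ratio(mutant, wild_type), is_differential(mutant, wild_type))
```

Both directions are flagged because the mutation may cause either overexpression or underexpression, so a selective compound could make the mutated cells either more or less sensitive than the wild type.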
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1 5rAVYKR2X?.^:< ArVRFOKITA PMiVSPVAIT QKVZS^r£QQ 

31 VrriELDXlA ASZAATM77I KPD?AVLAAR ZAVSIJIiHXQT TKQrSKVSKS 

101 LYSYINPKTG IHS7MISXST TOIIMBHSOB L:iSAI'/Y!>Rr) FNVNVPOFKT 

151 :,£RSyLLRlN Gr/ASRPQKL IMPVAVSIHC KOIPR'vlEr/ NLMSQRFFTH 

201 GSPCLFNA'5? F?.?QMSSC7L LAMKODSIEG ITOTLKSCAL ISKSAGOIOL 
251 YIASTNGTS;-? CIIPM^rBVPN NCARYVDOGtS WKRPGAPALY 

3 CI LEFWKSDtF!: r IDIRKNHCK S2IRAR0L?P ALWIPDLFMK R'/HQNGCWTL 

351 FSPNEAPGLA CV^;GDE7BEL YTWEKSNRG RQTI5CAQKLW YAILGAt^TBT 

401 OTPFMLYXDS C:zniS:^QKKL GriXSSNLCC Sr/SYSAPDE VAVCNWSIA 

451 LPSyV2NX33K STS-YXFCKIiH QVTXV-i/rPJC :«lvrD5N«iY? VPEABRSKMR 

501 KRPIALGVQG lATAFMIlRL P?DSQEA33L KIQXF2TIYII AAVSASISIA 

551 XSSOAYSr/P GS?ASCCLLQ fDLWNRKPTE LWOWDCLXQ^ LAKHCnWSL 

601 LVAPKPTAS? SSIlONraSCF SPYTSK1Y5R RVLAGEFQIV NPYwLKDtVO 

651 wrx^:DA^!^:s sirAi^KGSZQ Atp:^:?DfiiK al/ktvweis qkhiidkaad 

701 RAAFraaSQS LTCHrKDPTS GXLTSIIHFYG WKXGLKTCMV VUlTOAASAA 

751 IQiTIDCKIA £TACKTV?J;L CKl^rXKr^-M XGRVSSSNTS 0APYKSF3TS 

8Cl PTSLSSSVAC LX-IOEGEXF AEDXTlHEIiZ TOIYSAiCVIA CAISNPESC? 

851 MC5G 
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1 H2TSX2TFLF 7SESV32GKF DXICDQVSDA Ili>ACLAV'Df LSKVACSTAA 

51 KTG!{:>r/5«3S ZZrLkQUO'(Q XIZF-DTIW: •T/DD^EKOFD YOTCMVt-Al 

101 EQQSFOIAJG Lt-IYSKALSSL VATDSTrEW. PiTICLAKKL 

151 :%'AALASAR5S 3S:.?WLR9Cr KTQVTJSYSK DGOAVIPKRV trtlVISTJKA 

201 ESIT7SML»;< ET-KIIKOV IPEHuLSOKT lyKIQPSQRF VI0GPC<3CAfl 

251 LtORKIlVt? V^^jAKGOG AFSGKOPSRV DRSAJIVJUUIW VASCSLVTAGL 

301 AXaALVCFSY ATCVASP^SX J?*ALVS:iK NKFDCP.PGTv'I 

3 SI \-<2t2LA?.?: YfKTASY'JHr TKQEIISWSCP KiaKF 
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